In this blog post we will discuss the Cohere Large Language Model on the AWS SageMaker platform.
Introduction
Large language models are artificial intelligence tools that open new possibilities for text understanding and generation. They can recognize, predict, and generate human language (producing sentences similar to how humans talk and write) on the basis of very large text-based data sets.
Large language models (LLMs) can be used for many types of tasks, such as:
- text generation
- question answering systems
- machine translation
- automatic summarization
- document classification
- sentiment analysis
- code generation
- etc.
List of large language models and their sizes
Cohere
The Cohere platform provides access to advanced large language models and NLP tools through one easy-to-use API. It can be used to generate or predict text to do things like moderate content, extract information, classify data, and write copy, all at massive scale. Cohere trains large language models and puts them behind a simple API. Cohere.ai offers both generative language models (like GPT-3 and GPT-2) and representation language models (like BERT).
Cohere Generation models come in two sizes: Medium and Extremely Large.
Cohere Representation models come in three sizes: Small, Medium, and Large.
Cohere Products and Use Cases:
Cohere mainly provides three products, which cover most NLP use cases.
1. Classify: Classify uses cutting-edge machine learning to analyze and bucket text into specific categories (a short sketch follows this list). Typical use cases:
- Topic classification
- Content moderation
- Intent recognition
- Support ticket routing
- Sentiment analysis
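For illustration, here is a minimal sketch of a sentiment-style classification request using Cohere's Python SDK. The API key, example texts, and labels are placeholders, and the exact import path for the Example class can vary between SDK versions.
import cohere
from cohere.classify import Example  # import path may differ by SDK version

co = cohere.Client("YOUR_API_KEY")  # placeholder API key

# a handful of labelled examples is enough for the Classify endpoint
response = co.classify(
    inputs=["The package arrived broken and support never replied."],
    examples=[
        Example("I love this product, works perfectly", "positive"),
        Example("Great value for the price", "positive"),
        Example("Terrible quality, very disappointed", "negative"),
        Example("It stopped working after two days", "negative"),
    ],
)
print(response.classifications[0].prediction)  # e.g. "negative"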
2. Generate: The Generate API is powered by a large language model. It's trained on billions of words spanning different topics and industries. Typical use cases (a short sketch follows this list):
- Entity extraction
- Article summarization
- Writing blog posts
- Writing ad creative
- Writing product descriptions
- Spelling, grammar correction
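As a quick illustration, the sketch below calls the Generate endpoint through Cohere's Python SDK. The API key and prompt are placeholders, and parameter defaults may differ slightly between SDK versions.
import cohere

co = cohere.Client("YOUR_API_KEY")  # placeholder API key

# ask the model to summarize a short passage
response = co.generate(
    prompt="Summarize in one sentence: Cohere provides large language "
           "models behind a simple API for generation, classification, "
           "and embeddings.",
    max_tokens=60,
    temperature=0.5,
)
print(response.generations[0].text)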
3. Embed: Embeddings are numerical representations of the meaning of text, which can be compared with one another based on similarity (a short sketch follows this list). Typical use cases:
- Topic modeling
- Recommendations
- Semantic search
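The sketch below shows how embeddings could be used for a tiny semantic-similarity check with the Python SDK. The API key and sentences are placeholders chosen only to illustrate the idea behind semantic search.
import cohere
import numpy as np

co = cohere.Client("YOUR_API_KEY")  # placeholder API key

texts = [
    "How do I reset my password?",
    "I forgot my login credentials",
    "What is the shipping cost to Canada?",
]
response = co.embed(texts=texts)
vectors = np.array(response.embeddings)

# cosine similarity between related vs. unrelated sentences
def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine(vectors[0], vectors[1]))  # related questions -> higher score
print(cosine(vectors[0], vectors[2]))  # unrelated question -> lower score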
Amazon SageMaker
Amazon SageMaker is a fully managed cloud machine learning service for data science and machine learning workflows. Data scientists and developers can quickly and easily build and train machine learning models and deploy them within SageMaker itself. For exploration and analysis, SageMaker provides integrated Jupyter notebook instances. SageMaker also provides common machine learning algorithms (Linear Learner, BlazingText, CatBoost, DeepAR Forecasting, K-Means, LDA, LightGBM, etc.) that are optimized to run efficiently on large data in a distributed environment.
SageMaker provides the following major features
- Amazon Augmented AI
- SageMaker Data Wrangler
- SageMaker Ground Truth
- SageMaker Autopilot
- SageMaker Clarify
- Batch Transform
- SageMaker Debugger
- SageMaker Edge Manager
- SageMaker Experiments
- SageMaker Feature Store
- SageMaker Model Building Pipelines
- SageMaker Model Monitor
- SageMaker Neo
- SageMaker Serverless Endpoints
- SageMaker Studio Notebooks
- etc.
SageMaker provides the following machine learning environments.
- SageMaker Studio
- SageMaker Studio Lab
- SageMaker Canvas
- RStudio on Amazon SageMaker
Cohere in SageMaker
Cohere.ai’s state-of-the-art language AI is available through Amazon SageMaker. It has now become easier for developers to deploy Cohere's pre-trained generation language models to SageMaker, an end-to-end machine learning service. Cohere focuses on language AI. Developers can access Cohere's Medium generation language model by using SageMaker. The Medium model is deployed in containers that enable low-latency inference on AWS, and it can be used for question answering, copywriting, or paraphrasing.
Developer benefits
Developers get three key benefits when using the Cohere Medium generation language model through SageMaker:
Build, iterate, and deploy quickly: Any developer (no ML, AI, or NLP expertise required) can quickly get access to Cohere's pre-trained generation model, which understands context and predicts text at unprecedented levels. This model reduces time-to-value for customers by providing an out-of-the-box solution for multiple language-understanding tasks.
Speed and accuracy: The Cohere Medium generation model offers a good balance between cost, quality, and latency. Using a simple API, developers can easily integrate the Cohere Medium Generate endpoint into their apps, and an SDK is also provided.
Private and secure: In SageMaker, data remains secure because Cohere's models are served from self-managed containers running inside your AWS environment.
Deploy the SageMaker endpoint using a notebook:
Cohere packages its Medium generation language model, along with an optimized, low-latency inference framework, in containers that can be deployed as SageMaker inference endpoints. The containers can be deployed on a range of instance types (including ml.g5.2xlarge, ml.p3.2xlarge, and ml.g5.xlarge) that offer different performance/cost trade-offs. Currently these containers are available in two Regions: us-east-1 and eu-west-1. Developers can get started quickly using the Jupyter notebook provided by Cohere, or by following the steps below.
1. Pre-requisites:
1. The IAM role must have the AmazonSageMakerFullAccess policy attached.
2. Ensure one of the following:
A. Either you have access to make AWS Marketplace subscriptions and your IAM role has these three permissions in the AWS account:
a. aws-marketplace:Subscribe
b. aws-marketplace:Unsubscribe
c. aws-marketplace:ViewSubscriptions
B. Or your AWS account already has a subscription to the Cohere Medium model. If so, skip step 2.
2. Subscribe to the model package
a. Open the Cohere Medium model package listing page.
b. On the AWS Marketplace listing, click the Continue to Subscribe button.
c. Click Accept Offer if you agree with the EULA, pricing, and support terms.
d. Click the Continue to configuration button and then select a Region (us-east-1 or eu-west-1). A Product ARN will be displayed; copy this ARN, as it will be used later when creating a deployable model with AWS Boto3.
3. Import packages and initialize
# install the Cohere SageMaker helper package
!pip install cohere-sagemaker

# import packages
from cohere_sagemaker import Client, CohereError
from sagemaker import ModelPackage
import sagemaker as sage
from sagemaker import get_execution_role
import boto3
import numpy as np
4. Create an endpoint and perform real-time inference
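Below is a minimal sketch of how the endpoint could be created from the Marketplace model package and queried with the cohere_sagemaker Client. The model package ARN, endpoint name, and test prompt are placeholders (copy the real ARN from the Marketplace configuration page), and the helper API may differ between cohere-sagemaker versions.
# placeholder ARN copied from the AWS Marketplace configuration page
model_package_arn = "arn:aws:sagemaker:us-east-1:<account-id>:model-package/cohere-medium-example"

role = get_execution_role()
session = sage.Session()

# create a deployable model from the Marketplace model package
model = ModelPackage(
    role=role,
    model_package_arn=model_package_arn,
    sagemaker_session=session,
)

model_name = "cohere-medium"  # hypothetical endpoint name
model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.xlarge",
    endpoint_name=model_name,
)

# point the Cohere client at the freshly deployed endpoint
co = Client(endpoint_name=model_name)

# quick smoke test: real-time inference against the endpoint
response = co.generate(prompt="Write a tagline for an ice cream shop.", max_tokens=20)
print(response.generations[0].text)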
Writing a blog post with co.generate
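A possible prompt for drafting a blog post intro might look like the following sketch; the prompt text and generation parameters are only illustrative.
prompt = (
    "Write the opening paragraph of a blog post about how large language "
    "models help developers build smarter applications."
)
response = co.generate(prompt=prompt, max_tokens=200, temperature=0.9)
print(response.generations[0].text)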
Entity Extraction using co.generate
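Entity extraction can be framed as a few-shot prompt, as in the hypothetical sketch below; the support messages and the extracted field are made up for illustration.
prompt = """Extract the product name from each support message.

Message: My Galaxy S22 stopped charging overnight.
Product: Galaxy S22
--
Message: The battery on my MacBook Pro drains within an hour.
Product:"""

response = co.generate(prompt=prompt, max_tokens=10, temperature=0.3)
print(response.generations[0].text)  # expected to contain "MacBook Pro"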
Delete the endpoint
You have successfully performed real-time inference. You can now terminate the endpoint to avoid being charged.
# delete the endpoint and its configuration to stop incurring charges
model.sagemaker_session.delete_endpoint(model_name)
model.sagemaker_session.delete_endpoint_config(model_name)
5. Clean-up
Delete the model
model.delete_model()
Unsubscribe from the listing
Follow the steps below to unsubscribe from the product on AWS Marketplace:
- Go to the Machine Learning tab on your Software subscriptions page.
- Locate the subscription you want to cancel in the listing, then select Cancel Subscription.
If you want to know about the 4 best open source OCR libraries, please check this link.