In this blog post we will discuss the Cohere Large Language Model on the AWS SageMaker platform.
Introduction
Large language models are artificial intelligence tools that open new possibilities for text understanding and generation. They can recognize, predict, and generate human language (producing sentences similar to how humans talk and write) on the basis of very large text-based data sets.
Large language models (LLMs) can be used for many types of tasks, such as:
- text generation
- question answering systems
- machine translation
- automatic summarization
- document classification
- sentiment analysis
- code generation
- etc.
List of large language models and their sizes
Cohere
The Cohere platform provides access to advanced large language models and NLP tools through one easy-to-use API. It can be used to generate or predict text to do things like moderate content, extract information, classify data, and write copy, all at massive scale. Cohere trains large language models and puts them behind a simple API. Cohere.ai offers both generative language models (like GPT-3 and GPT-2) and representation language models (like BERT).
Cohere Generation models come in two sizes: Medium and Extremely Large.
Cohere Representation models come in three sizes: Small, Medium, and Large.
Cohere Products and Use Cases:
Cohere mainly provides three products, which cover most NLP use cases.
1. Classify: Classify uses cutting-edge machine learning to analyze and bucket text into specific categories (a short sketch follows this list). Typical use cases:
- Topic classification
- Content moderation
- Intent recognition
- Support ticket routing
- Sentiment analysis
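For illustration, here is a minimal sketch of a sentiment-style classification request using Cohere's Python SDK. The API key, example texts, and labels are placeholders, and the exact import path for the Example class can vary between SDK versions.
import cohere
from cohere.classify import Example  # import path may differ by SDK version

co = cohere.Client("YOUR_API_KEY")  # placeholder API key

# a handful of labelled examples is enough for the Classify endpoint
response = co.classify(
    inputs=["The package arrived broken and support never replied."],
    examples=[
        Example("I love this product, works perfectly", "positive"),
        Example("Great value for the price", "positive"),
        Example("Terrible quality, very disappointed", "negative"),
        Example("It stopped working after two days", "negative"),
    ],
)
print(response.classifications[0].prediction)  # e.g. "negative"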
2. Generate: The Generate API is powered by a large language model. It's trained on billions of words spanning different topics and industries. Typical use cases (a short sketch follows this list):
- Entity extraction
- Article summarization
- Writing blog posts
- Writing ad creative
- Writing product descriptions
- Spelling, grammar correction
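As a quick illustration, the sketch below calls the Generate endpoint through Cohere's Python SDK. The API key and prompt are placeholders, and parameter defaults may differ slightly between SDK versions.
import cohere

co = cohere.Client("YOUR_API_KEY")  # placeholder API key

# ask the model to summarize a short passage
response = co.generate(
    prompt="Summarize in one sentence: Cohere provides large language "
           "models behind a simple API for generation, classification, "
           "and embeddings.",
    max_tokens=60,
    temperature=0.5,
)
print(response.generations[0].text)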
3. Embed: Embeddings are numerical representations of the meaning of text, which can be compared with one another based on similarity (a short sketch follows this list). Typical use cases:
- Topic modeling
- Recommendations
- Semantic search
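The sketch below shows how embeddings could be used for a tiny semantic-similarity check with the Python SDK. The API key and sentences are placeholders chosen only to illustrate the idea behind semantic search.
import cohere
import numpy as np

co = cohere.Client("YOUR_API_KEY")  # placeholder API key

texts = [
    "How do I reset my password?",
    "I forgot my login credentials",
    "What is the shipping cost to Canada?",
]
response = co.embed(texts=texts)
vectors = np.array(response.embeddings)

# cosine similarity between related vs. unrelated sentences
def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine(vectors[0], vectors[1]))  # related questions -> higher score
print(cosine(vectors[0], vectors[2]))  # unrelated question -> lower score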
Amazon SageMaker
Amazon SageMaker is a fully managed cloud machine learning service for data science and machine learning workflows. Data scientists and developers can quickly and easily build and train machine learning models and deploy them within SageMaker itself. For exploration and analysis, SageMaker provides integrated Jupyter notebook instances. SageMaker also provides common machine learning algorithms (Linear Learner, BlazingText, CatBoost, DeepAR Forecasting, K-Means, LDA, LightGBM, etc.) that are optimized to run efficiently on large data in a distributed environment.
SageMaker provides the following major features
- Amazon Augmented AI
- SageMaker Data Wrangler
- SageMaker Ground Truth
- SageMaker Autopilot
- SageMaker Clarify
- Batch Transform
- SageMaker Debugger
- SageMaker Edge Manager
- SageMaker Experiments
- SageMaker Feature Store
- SageMaker Model Building Pipelines
- SageMaker Model Monitor
- SageMaker Neo
- SageMaker Serverless Endpoints
- SageMaker Studio Notebooks
- etc.
SageMaker provides the following machine learning environments.
- SageMaker Studio
- SageMaker Studio Lab
- SageMaker Canvas
- RStudio on Amazon SageMaker
Cohere in SageMaker
Cohere.ai’s state-of-the-art language AI is available through Amazon SageMaker. It has now become easier for developers to deploy Cohere's pre-trained generation language models to SageMaker, an end-to-end machine learning service. Cohere focuses on language AI. Developers can access Cohere's Medium generation language model by using SageMaker. The Medium model is deployed in containers that enable low-latency inference on AWS, and it can be used for question answering, copywriting, or paraphrasing.
Developer benefits
Developers get three key benefits when using the Cohere Medium generation language model through SageMaker:
Build, iterate, and deploy quickly: Any developer (no ML, AI, or NLP expertise required) can quickly get access to Cohere's pre-trained generation model, which understands context and predicts text at unprecedented levels. This model reduces time-to-value for customers by providing an out-of-the-box solution for multiple language-understanding tasks.
Speed and accuracy: The Cohere Medium generation model offers a good balance between cost, quality, and latency. Using a simple API, developers can easily integrate the Cohere Medium Generate endpoint into their apps, and an SDK is also provided.
Private and secure: In SageMaker, data remains secure because Cohere's models are served from self-managed containers running inside your AWS environment.
Deploy the SageMaker endpoint using a notebook:
Cohere packages its Medium generation language model, along with an optimized, low-latency inference framework, in containers that can be deployed as SageMaker inference endpoints. The containers can be deployed on a range of instance types (including ml.g5.2xlarge, ml.p3.2xlarge, and ml.g5.xlarge) that offer different performance/cost trade-offs. Currently these containers are available in two Regions: us-east-1 and eu-west-1. Developers can get started quickly using the Jupyter notebook provided by Cohere, or by following the steps below.
1. Pre-requisites:
1. The IAM role must have the AmazonSageMakerFullAccess policy attached.
2. Ensure one of the following:
A. Either you have access to make AWS Marketplace subscriptions and your IAM role has these three permissions in the AWS account:
a. aws-marketplace:Subscribe
b. aws-marketplace:Unsubscribe
c. aws-marketplace:ViewSubscriptions
B. Or your AWS account already has a subscription to the Cohere Medium model. If so, skip step 2.
2. Subscribe to the model package
a. Open the Cohere Medium model package listing page.
b. On the AWS Marketplace listing, click the Continue to Subscribe button.
c. Click Accept Offer if you agree with the EULA, pricing, and support terms.
d. Click the Continue to configuration button and then select a Region (us-east-1 or eu-west-1). A Product ARN will be displayed; copy this ARN, as it will be used later when creating a deployable model with AWS Boto3.
3. Import packages and initialize
# install the Cohere SageMaker helper package
!pip install cohere-sagemaker

# import packages
from cohere_sagemaker import Client, CohereError
from sagemaker import ModelPackage
import sagemaker as sage
from sagemaker import get_execution_role
import boto3
import numpy as np
4. Create an endpoint and perform real-time inference
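Below is a minimal sketch of how the endpoint could be created from the Marketplace model package and queried with the cohere_sagemaker Client. The model package ARN, endpoint name, and test prompt are placeholders (copy the real ARN from the Marketplace configuration page), and the helper API may differ between cohere-sagemaker versions.
# placeholder ARN copied from the AWS Marketplace configuration page
model_package_arn = "arn:aws:sagemaker:us-east-1:<account-id>:model-package/cohere-medium-example"

role = get_execution_role()
session = sage.Session()

# create a deployable model from the Marketplace model package
model = ModelPackage(
    role=role,
    model_package_arn=model_package_arn,
    sagemaker_session=session,
)

model_name = "cohere-medium"  # hypothetical endpoint name
model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.xlarge",
    endpoint_name=model_name,
)

# point the Cohere client at the freshly deployed endpoint
co = Client(endpoint_name=model_name)

# quick smoke test: real-time inference against the endpoint
response = co.generate(prompt="Write a tagline for an ice cream shop.", max_tokens=20)
print(response.generations[0].text)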
Writing a blog post with co.generate
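A possible prompt for drafting a blog post intro might look like the following sketch; the prompt text and generation parameters are only illustrative.
prompt = (
    "Write the opening paragraph of a blog post about how large language "
    "models help developers build smarter applications."
)
response = co.generate(prompt=prompt, max_tokens=200, temperature=0.9)
print(response.generations[0].text)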
Entity Extraction using co.generate
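Entity extraction can be framed as a few-shot prompt, as in the hypothetical sketch below; the support messages and the extracted field are made up for illustration.
prompt = """Extract the product name from each support message.

Message: My Galaxy S22 stopped charging overnight.
Product: Galaxy S22
--
Message: The battery on my MacBook Pro drains within an hour.
Product:"""

response = co.generate(prompt=prompt, max_tokens=10, temperature=0.3)
print(response.generations[0].text)  # expected to contain "MacBook Pro"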
Delete the endpoint
You have successfully performed real-time inference. You can now terminate the endpoint to avoid being charged.
# delete the endpoint and its configuration to stop incurring charges
model.sagemaker_session.delete_endpoint(model_name)
model.sagemaker_session.delete_endpoint_config(model_name)
5. Clean-up
Delete the model
model.delete_model()
Unsubscribe from the listing
Follow the steps below to unsubscribe from the product on AWS Marketplace:
- Go to the Machine Learning tab on your Software subscriptions page.
- Locate the subscription you want to cancel in the listing, then select Cancel Subscription.
If you want to know about the 4 best open source OCR libraries, please check this link.