This repo is an example that sets up a devbox environment with Python and pip. It uses FastAPI to serve the API endpoint described below and calls the NVIDIA API to interact with the Nemotron LLM.
To set up this repo, make sure you have devbox installed and your NVIDIA API key at hand.
- Clone this repo:
  git clone https://github.com/jetify-examples/python-nvidia-nemotron.git
- Change into the project directory and then run the install script:
  cd python-nvidia-nemotron
  devbox run install
- Copy your NVIDIA API key into the "env" section of devbox.json.
- Start the server:
  devbox run start
Once the server is set up and running, you can access the static page by visiting localhost:8080.
The app also exposes an API endpoint for interacting with NVIDIA's Nemotron. The endpoint responds to POST requests to /api/prompt. Below is an example request and response:
Example request:
curl --location 'http://127.0.0.1:8080/api/prompt' \
--header 'Content-Type: application/json' \
--data '{
"prompt": "What is the circumference of Earth?"
}'
Example response:
{
  "message": {
    "content": "The circumference of Earth is approximately 24,901 miles (40,075 kilometers).",
    "refusal": null,
    "role": "assistant",
    "function_call": null,
    "tool_calls": null
  }
}
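The same request can be made from Python. The following is a minimal sketch using only the standard library; the function names (build_prompt_request, extract_content, ask) and the 60-second timeout are illustrative assumptions and not part of this repo. It assumes the server is listening on 127.0.0.1:8080 and returns a body shaped like the example response above.

```python
import json
import urllib.request


def build_prompt_request(
    prompt: str, base_url: str = "http://127.0.0.1:8080"
) -> urllib.request.Request:
    """Build the POST request for /api/prompt without sending it."""
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/prompt",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_content(response_body: dict) -> str:
    """Return the assistant's text from a response shaped like the example."""
    return response_body["message"]["content"]


def ask(
    prompt: str, base_url: str = "http://127.0.0.1:8080", timeout: float = 60.0
) -> str:
    """Send the prompt to a running server and return the reply text."""
    request = build_prompt_request(prompt, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return extract_content(json.load(response))
```

With the server running, calling ask("What is the circumference of Earth?") should return the text found under message.content in the response.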
To use this project as a skeleton and build on it, refer to the FastAPI docs, specifically the sections on routes and handlers, as well as NVIDIA's API Reference.