Nvidia’s new tool lets you run GenAI models on a PC


Nvidia, always eager to encourage the purchase of its latest GPUs, is releasing a tool that allows owners of GeForce RTX 30 and 40 series cards to run an AI-powered chatbot offline on a Windows PC.

Called Chat with RTX, the tool allows users to customize a GenAI model along the lines of OpenAI's ChatGPT by connecting it to documents, files and notes that it can then query.

“Rather than searching through notes or saved content, users can simply enter queries,” Nvidia writes in a blog post. “For example, one could ask, ‘What restaurant did my partner recommend in Las Vegas?’ and Chat with RTX will scan the local files the user points it to and provide the answer with context.”

Chat with RTX uses the open source model from AI startup Mistral by default, but supports other text-based models, including Meta's Llama 2. Nvidia warns that downloading all the necessary files will consume a fair amount of storage – 50 GB to 100 GB, depending on the model(s) selected.

Currently, Chat with RTX works with text, PDF, .doc, .docx and .xml formats. Pointing the application to a folder containing supported files will load them into the model's fine-tuning dataset. Additionally, Chat with RTX can take a YouTube playlist's URL and load transcripts of the videos in the playlist, allowing the selected model to query their contents.
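Nvidia doesn't detail the internals here, but the basic workflow is easy to picture. Below is a minimal, illustrative sketch in Python of the "point it at a folder, then query" idea, assuming a naive keyword-overlap ranking; the folder name, function names and scoring are hypothetical stand-ins, not Nvidia's implementation:

```python
import os

# Toy sketch only: scan a folder for the file types Chat with RTX supports
# and rank files by crude keyword overlap with a query. A real pipeline
# would parse PDFs/Office files properly and pass top hits to the model.
SUPPORTED = {".txt", ".pdf", ".doc", ".docx", ".xml"}

def collect_files(folder):
    """Return paths of supported files under `folder` (name is illustrative)."""
    hits = []
    for root, _dirs, names in os.walk(folder):
        for name in names:
            if os.path.splitext(name)[1].lower() in SUPPORTED:
                hits.append(os.path.join(root, name))
    return hits

def score(path, query):
    """Naive relevance: count query words appearing in the file's raw text."""
    try:
        text = open(path, "rb").read().decode("utf-8", errors="ignore").lower()
    except OSError:
        return 0
    return sum(text.count(word) for word in query.lower().split())

if __name__ == "__main__":
    docs = collect_files("./notes")  # the folder the user "points" the app at
    query = "restaurant recommendation Las Vegas"
    ranked = sorted(docs, key=lambda p: score(p, query), reverse=True)
    print(ranked[:3])                # top candidates to hand to the model
```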

Now, there are some limitations to keep in mind, which Nvidia, to its credit, outlines in a handy guide.

Image credits: Nvidia

Chat with RTX cannot remember context, which means the app will not consider any previous questions when answering follow-up questions. For example, if you ask “What is a common bird in North America?” and follow up with “What are its colors?”, Chat with RTX won't know you're talking about birds.
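To make that limitation concrete, here's a tiny sketch contrasting stateless prompting with the usual history-aware workaround; the generate() function is a hypothetical stand-in for any local model call, not part of Chat with RTX:

```python
# Hypothetical stand-in for a local LLM call, for illustration only.
def generate(prompt: str) -> str:
    return f"<model answer to: {prompt!r}>"  # placeholder, not a real model

# Stateless, as described for Chat with RTX: each question stands alone.
print(generate("What is a common bird in North America?"))
print(generate("What are its colors?"))  # "its" has no referent here

# A context-aware app would prepend the conversation so far instead:
history = []
for question in ["What is a common bird in North America?",
                 "What are its colors?"]:
    prompt = "\n".join(history + [question])
    answer = generate(prompt)
    history += [question, answer]
```

Prepending the running transcript this way is the common workaround, at the cost of ever-longer prompts.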

Nvidia also acknowledges that the relevance of the app's responses can be affected by a range of factors, some easier to control than others, including the wording of the questions, the performance of the selected model and the size of the fine-tuning dataset. Asking for facts covered in a few documents is likely to yield better results than asking for a summary of a document or set of documents. And response quality will generally improve with larger datasets, as will pointing Chat with RTX at more content about a specific topic, according to Nvidia.

Chat with RTX is therefore more of a toy than anything that can be used in production. There is, however, something to be said for applications that make it easier to run AI models locally – which is a growing trend.

In a recent report, the World Economic Forum predicted “dramatic” growth in affordable devices capable of running GenAI models offline, including PCs, smartphones, Internet of Things devices and networking equipment. The reasons, according to the WEF, are the obvious advantages: not only are offline models inherently more private (the data they process never leaves the device they run on), but they also have lower latency and are more cost-effective than cloud-hosted models.

Of course, democratizing the tools to run and train models opens the door to malicious actors: a quick Google search yields many listings of models fine-tuned on toxic content from unscrupulous corners of the web. But proponents of apps like Chat with RTX argue the benefits outweigh the harms. We'll have to wait and see.


