Welcome to the Artifact Quickstart Guide! Here you will learn how to create your own RAG knowledge base by building a Catalog, and how to use the Ask Catalog API.
Artifact automates unstructured data ETL by transforming raw files into a Catalog - a unified AI-ready format. A Catalog serves as a sophisticated Knowledge Base that ensures your data has been effectively processed, curated, and prepared for all of your future AI and Retrieval-Augmented Generation (RAG) needs.
For this guide, we will use Shakespeare's play A Midsummer Night's Dream, from Project Gutenberg, as our sample document.
#Prerequisites
- Launch your own self-hosted Instill Core instance locally by following the steps on the previous page, or, if you are using Instill Core as a managed service, click here to log in.
- For self-hosted Instill Core users, since the embedding feature in Artifact will be required, don't forget to add an OpenAI Secret Key to your deployment's configuration.
- Download the play as a .pdf file from this link. Ensure that it saves to your local machine with the name midsummer-nights-dream.pdf.
- Create an API Token by following the steps in the API Token Management guide.
In the following steps, you will see HOST_URL used as a placeholder. If you are using Instill Core as a managed service, set HOST_URL to https://private.instill-ai.com. If you are self-hosting Instill Core, use http://localhost:8080.
#Create Catalog via Console
Creating an AI-ready Catalog consists of three steps: Create Catalog, Upload Files, and Process Files. The fastest way to get started with Artifact is via the Console UI. Let's get started!
#Create Catalog
To create a new empty Catalog, follow these steps:
- Open up the Console via local Instill Core deployment at http://localhost:3000, or by selecting the Go to console button in the bottom-left of the Instill Agent interface if you are using Instill Core as a managed service.
- Navigate to the Artifacts page using the navigation bar
- Click the + Create Catalog button
- Select the Owner (namespace)
- Enter 'shakespeare' as the name for your Catalog
- Enter 'Works of Shakespeare' as the description for your Catalog
- Click the Create button
#Upload Files
To upload the midsummer-nights-dream.pdf file to the 'shakespeare' Catalog, follow these steps:
- Click the 'shakespeare' Catalog card
- Select Upload Documents in the left panel
- Drag and drop your midsummer-nights-dream.pdf file into the blue box or click browse computer to locate and upload it
#Process Files
To process midsummer-nights-dream.pdf into chunks and corresponding embeddings, simply click the Process Files button. The processing status appears in the Files tab.

When the status is Completed, you can view your parsed Files and Chunks.

#Create Catalog via the API
Creating an AI-ready Catalog consists of three steps: Create Catalog, Upload Files, and Process Files.
#Create Catalog
To create a new empty Catalog, follow these steps:
- Generate an INSTILL_API_TOKEN by going to Console > Settings > API Tokens or following the steps here
- Copy and paste the following snippet into your Python script or terminal, replacing the ******** with your API token, and NAMESPACE_ID with your namespace ID. Also, replace HOST_URL with your Instill Core endpoint (see note above).
- Hit enter to create the Catalog
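The snippet these steps refer to is not reproduced on this page, so the following is a minimal Python sketch of what such a request could look like. The endpoint path (/v1alpha/namespaces/NAMESPACE_ID/catalogs) and the Bearer authorization scheme are assumptions not stated in this guide; check them against the Artifact API reference before relying on them. The helper name build_create_catalog_request is hypothetical.

```python
import json
import urllib.request

# Placeholders used in this guide; replace them before running.
HOST_URL = "http://localhost:8080"  # managed service: https://private.instill-ai.com
NAMESPACE_ID = "NAMESPACE_ID"
INSTILL_API_TOKEN = "********"


def build_create_catalog_request(host_url, namespace_id, api_token):
    """Build, without sending, a request that creates the 'shakespeare' Catalog."""
    # ASSUMPTION: this endpoint path is not given in the guide.
    url = f"{host_url}/v1alpha/namespaces/{namespace_id}/catalogs"
    payload = {
        "name": "shakespeare",
        "description": "Works of Shakespeare",
        "tags": [],
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # ASSUMPTION: the API token is passed as a Bearer token.
            "Authorization": f"Bearer {api_token}",
        },
        method="POST",
    )


# The request is only sent once the token placeholder has been replaced.
if __name__ == "__main__" and INSTILL_API_TOKEN != "********":
    request = build_create_catalog_request(HOST_URL, NAMESPACE_ID, INSTILL_API_TOKEN)
    with urllib.request.urlopen(request) as response:
        print(json.dumps(json.load(response), indent=2))
```

Building the request separately from sending it makes it easy to inspect the URL and body before anything reaches your Instill Core instance.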
#Example Response
A successful response will return details about the newly created Catalog:
{ "catalog": { "catalogId": "shakespeare", "name": "shakespeare", "description": "Works of Shakespeare", "createTime": "2024-09-03 22:19:32.77246 +0000 UTC", "updateTime": "2024-09-03 22:19:32.774342789 +0000 UTC", "ownerName": "a7219ce0-4c6c-4dd5-8ac5-1fbf87aedd4a", "tags": [], "convertingPipelines": [ "preset/indexing-convert-pdf" ], "splittingPipelines": [ "preset/indexing-split-text", "preset/indexing-split-markdown" ], "embeddingPipelines": [ "preset/indexing-embed" ], "downstreamApps": [], "totalFiles": 0, "totalTokens": 0, "usedStorage": "0" }}
#Upload Files
To upload the midsummer-nights-dream.pdf file to the shakespeare Catalog, follow these steps:
- Navigate within your terminal to the directory where the midsummer-nights-dream.pdf file is located
- Copy and paste the following snippet into your terminal, replacing the ******** with your API token, and NAMESPACE_ID with your namespace ID. Also, remember to replace HOST_URL with your Instill Core endpoint (see note above).
- Hit enter to upload the file
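As a rough Python sketch of the upload step: the code below assumes the file is sent as base64-encoded content in a JSON body with name, type, and content fields (these field names appear in the example response, but their use in the request is an assumption), and that the endpoint is /v1alpha/namespaces/NAMESPACE_ID/catalogs/shakespeare/files. Neither detail is confirmed by this guide, and build_upload_request is a hypothetical helper name.

```python
import base64
import json
import urllib.request
from pathlib import Path

# Placeholders used in this guide; replace them before running.
HOST_URL = "http://localhost:8080"  # managed service: https://private.instill-ai.com
NAMESPACE_ID = "NAMESPACE_ID"
INSTILL_API_TOKEN = "********"
FILE_PATH = Path("midsummer-nights-dream.pdf")


def build_upload_request(host_url, namespace_id, api_token, file_name, file_bytes):
    """Build, without sending, a request that uploads one PDF to the Catalog."""
    # ASSUMPTION: endpoint path and request body shape are not given in the guide.
    url = f"{host_url}/v1alpha/namespaces/{namespace_id}/catalogs/shakespeare/files"
    payload = {
        "name": file_name,
        "type": "FILE_TYPE_PDF",
        "content": base64.b64encode(file_bytes).decode("ascii"),
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # ASSUMPTION: the API token is passed as a Bearer token.
            "Authorization": f"Bearer {api_token}",
        },
        method="POST",
    )


# Only runs once the token placeholder is replaced and the PDF is present.
if __name__ == "__main__" and INSTILL_API_TOKEN != "********" and FILE_PATH.exists():
    request = build_upload_request(
        HOST_URL, NAMESPACE_ID, INSTILL_API_TOKEN, FILE_PATH.name, FILE_PATH.read_bytes()
    )
    with urllib.request.urlopen(request) as response:
        print(json.dumps(json.load(response), indent=2))
```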
#Example Response
A successful response will return details about the newly uploaded file:
{ "file": { "fileUid": "cd750b8f-5769-407c-afc0-955424387863", "name": "midsummer-nights-dream.pdf", "type": "FILE_TYPE_PDF", "processStatus": "FILE_PROCESS_STATUS_NOTSTARTED", "processOutcome": "", "retrievable": false, "content": "", "ownerUid": "a7219ce0-4c6c-4dd5-8ac5-1fbf87aedd4a", "creatorUid": "a7219ce0-4c6c-4dd5-8ac5-1fbf87aedd4a", "catalogUid": "593e376c-87c5-49d1-9135-1e88eb5ee847", "createTime": "2024-09-03T22:20:29.025902Z", "updateTime": "2024-09-03T22:20:29.026284095Z", "deleteTime": null, "size": "1207112", "totalChunks": 0, "totalTokens": 0 }}
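The fileUid in this response is the value needed for the Process Files step. A small sketch of pulling it out in Python follows; the response literal is a trimmed copy of the example above, and extract_file_uid is a hypothetical helper name.

```python
import json

# Trimmed copy of the example response above (other fields omitted for brevity).
upload_response = json.loads("""
{
  "file": {
    "fileUid": "cd750b8f-5769-407c-afc0-955424387863",
    "name": "midsummer-nights-dream.pdf",
    "type": "FILE_TYPE_PDF",
    "processStatus": "FILE_PROCESS_STATUS_NOTSTARTED"
  }
}
""")


def extract_file_uid(response):
    """Return the UID of the uploaded file from an Upload Files response."""
    return response["file"]["fileUid"]


FILE_UID = extract_file_uid(upload_response)
print(FILE_UID)  # cd750b8f-5769-407c-afc0-955424387863
```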
#Process Files
To process the uploaded file into chunks and corresponding embeddings, follow these steps:
- Copy and paste the following snippet into your terminal, replacing the ******** with your API token.
- Replace FILE_UID with the file UID from the previous Upload Files response
- Hit enter to process the file into an AI-ready Catalog
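A hedged Python sketch of the processing request is shown below. It assumes an endpoint at /v1alpha/catalogs/files/processAsync that accepts a JSON body with a fileUids list; both the path and the field name are assumptions that this guide does not confirm, and build_process_request is a hypothetical helper name.

```python
import json
import urllib.request

# Placeholders used in this guide; replace them before running.
HOST_URL = "http://localhost:8080"  # managed service: https://private.instill-ai.com
INSTILL_API_TOKEN = "********"
FILE_UID = "FILE_UID"


def build_process_request(host_url, api_token, file_uids):
    """Build, without sending, a request that schedules files for processing."""
    # ASSUMPTION: endpoint path and the "fileUids" field are not given in the guide.
    url = f"{host_url}/v1alpha/catalogs/files/processAsync"
    payload = {"fileUids": list(file_uids)}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # ASSUMPTION: the API token is passed as a Bearer token.
            "Authorization": f"Bearer {api_token}",
        },
        method="POST",
    )


# Only runs once both placeholders have been replaced.
if __name__ == "__main__" and INSTILL_API_TOKEN != "********" and FILE_UID != "FILE_UID":
    request = build_process_request(HOST_URL, INSTILL_API_TOKEN, [FILE_UID])
    with urllib.request.urlopen(request) as response:
        print(json.dumps(json.load(response), indent=2))
```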
#Example Response
A successful response will show that the file's processStatus has changed to FILE_PROCESS_STATUS_WAITING. This means the file has been scheduled for processing.
Once the files have been processed, you can:
- Obtain the parsed Markdown using the Get Single Source-of-Truth API
- Obtain the processed chunks using the View Chunks API
- Perform vector-based semantic search to retrieve relevant chunks for a query using the Retrieve Chunks API
- Perform a simple RAG inference via the Ask Catalog API (see below!)
#Perform RAG via the Ask Catalog API
With the Ask Catalog API, you can easily test and explore the knowledge contained within your new Catalog. Try it yourself using the steps below:
- Copy and paste the following snippet into your terminal, replacing the ******** with your API token, and NAMESPACE_ID with your namespace ID.
- Hit enter to ask your Catalog a question
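The following Python sketch shows one way such a question could be sent. The endpoint path (/v1alpha/namespaces/NAMESPACE_ID/catalogs/shakespeare/ask), the question and topK body fields, and the Bearer scheme are all assumptions rather than details confirmed by this guide. The question text is illustrative, chosen to match the example response, and build_ask_request is a hypothetical helper name.

```python
import json
import urllib.request

# Placeholders used in this guide; replace them before running.
HOST_URL = "http://localhost:8080"  # managed service: https://private.instill-ai.com
NAMESPACE_ID = "NAMESPACE_ID"
INSTILL_API_TOKEN = "********"

# Illustrative question; swap in your own.
QUESTION = "Who are the main characters involved in the love triangle in Act I?"


def build_ask_request(host_url, namespace_id, api_token, question, top_k=5):
    """Build, without sending, a request that asks the Catalog a question."""
    # ASSUMPTION: endpoint path and body fields are not given in the guide.
    url = f"{host_url}/v1alpha/namespaces/{namespace_id}/catalogs/shakespeare/ask"
    payload = {"question": question, "topK": top_k}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # ASSUMPTION: the API token is passed as a Bearer token.
            "Authorization": f"Bearer {api_token}",
        },
        method="POST",
    )


# The request is only sent once the token placeholder has been replaced.
if __name__ == "__main__" and INSTILL_API_TOKEN != "********":
    request = build_ask_request(HOST_URL, NAMESPACE_ID, INSTILL_API_TOKEN, QUESTION)
    with urllib.request.urlopen(request) as response:
        print(json.load(response)["answer"])
```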
Explore your Catalog and test its knowledge by asking your own questions about the play! For instance, you could also try asking "What role do the fairies play?".
#Example Response
A successful response will return the answer to the question, along with a list of similar chunks from the Catalog that the LLM used to generate the answer.
{ "answer": "The main characters involved in the love triangle in Act I are Hermia, Lysander, and Demetrius.", "similarChunks": [ { "chunkUid": "6fe30865-731f-4524-958b-5f12a6ab53a4", "similarityScore": 0.5944186, "textContent": "\n\n\n## ACT I \n\n\n### SCENE I. Athens. A room in the Palace of Theseus \npower I am made bold, Nor how it may concern my modesty In such a presence here to plead my thoughts: But I beseech your Grace that I may know The worst that may befall me in this case, If I refuse to wed Demetrius. THESEUS. Either to die the death, or to abjure For ever the society of men. Therefore, fair Hermia, question your desires, Know of your youth, examine well your blood, Whether, if you yield not to your father’s choice, You can endure the livery of a nun, For aye to be in shady cloister mew’d, To live a barren sister all your life, Chanting faint hymns to the cold fruitless moon. Thrice-blessèd they that master so their blood To undergo such maiden pilgrimage, But earthlier happy is", "sourceFile": "midsummer-nights-dream.pdf" }, { "chunkUid": "838cf477-290c-4484-8f17-78be26bcd946", "similarityScore": 0.56566054, "textContent": "\n\n\n## ACT I \n\n\n### SCENE I. Athens. A room in the Palace of Theseus \n[ _Exeunt all but_ LYSANDER _and_ HERMIA_._ ] LYSANDER. How now, my love? Why is your cheek so pale? How chance the roses there do fade so fast? HERMIA. Belike for want of rain, which I could well Beteem them from the tempest of my eyes. LYSANDER. Ay me! For aught that I could ever read, Could ever hear by tale or history, The course of true love never did run smooth. But either it was different in blood— HERMIA. O cross! Too high to be enthrall’d to low. LYSANDER. Or else misgraffèd in respect of years— HERMIA. O spite! Too old to be engag’d to young. LYSANDER. Or else it stood upon the choice of friends— HERMIA. O hell! to choose love by another’s eyes! LYSANDER. 
Or, if there were a sympathy", "sourceFile": "midsummer-nights-dream.pdf" }, { "chunkUid": "49c79ce7-e2a6-4f24-aa5f-12252911e382", "similarityScore": 0.5614925, "textContent": "\n## ACT III \n\n\n### SCENE I. The Wood. \nI pray thee, gentle mortal, sing again. Mine ear is much enamour’d of thy note. So is mine eye enthrallèd to thy shape; And thy fair virtue’s force perforce doth move me, On the first view, to say, to swear, I love thee. BOTTOM. Methinks, mistress, you should have little reason for that. And yet, to say the truth, reason and love keep little company together nowadays. The more the pity that some honest neighbours will not make them friends. Nay, I can gleek upon occasion. TITANIA. Thou art as wise as thou art beautiful. BOTTOM. Not so, neither; but if I had wit enough to get out of this wood, I have enough to serve mine own turn. TITANIA. Out of this wood do not desire to go. Thou shalt remain here whether thou wilt or no.", "sourceFile": "midsummer-nights-dream.pdf" }, { "chunkUid": "25955be1-9f4a-4acd-a9a9-a337b0ed73f5", "similarityScore": 0.5575189, "textContent": "\n## ACT III \n\n### SCENE II. Another part of the wood \n\n###### HELENA. \nwhat, my love, shall I compare thine eyne? Crystal is muddy. O how ripe in show Thy lips, those kissing cherries, tempting grow! That pure congealèd white, high Taurus’ snow, Fann’d with the eastern wind, turns to a crow When thou hold’st up thy hand. O, let me kiss This princess of pure white, this seal of bliss! HELENA. O spite! O hell! I see you all are bent To set against me for your merriment. If you were civil, and knew courtesy, You would not do me thus much injury. Can you not hate me, as I know you do, But you must join in souls to mock me too? 
If you were men, as men you are in show, You would not use a gentle lady so; To vow, and swear, and superpraise my parts, When I am sure you", "sourceFile": "midsummer-nights-dream.pdf" }, { "chunkUid": "d5f623aa-0985-4b48-aee5-a8f90e16a1f6", "similarityScore": 0.5506241, "textContent": "\n## ACT IV \n\n\n### SCENE I. The Wood \n\n###### TITANIA. \ntheir horns. _Horns, and shout within._ DEMETRIUS, LYSANDER, HERMIA _and_ HELENA _wake and start up._ Good morrow, friends. Saint Valentine is past. Begin these wood-birds but to couple now? LYSANDER. Pardon, my lord. _He and the rest kneel to_ THESEUS_._ THESEUS. I pray you all, stand up. I know you two are rival enemies. How comes this gentle concord in the world, That hatred is so far from jealousy To sleep by hate, and fear no enmity? LYSANDER. My lord, I shall reply amazedly, Half sleep, half waking; but as yet, I swear, I cannot truly say how I came here. But, as I think (for truly would I speak) And now I do bethink me, so it is: I came with Hermia hither. Our intent Was to be gone from Athens, where", "sourceFile": "midsummer-nights-dream.pdf" } ]}
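To work with a response like this in code, one option is to rank the returned chunks by similarityScore and keep only those above a cutoff. The sketch below uses three entries copied from the example response with textContent omitted for brevity; the 0.56 cutoff and the chunks_above helper are hypothetical choices for illustration, not part of the API.

```python
# Three entries copied from the example response (textContent omitted).
ask_response = {
    "answer": "The main characters involved in the love triangle in Act I "
              "are Hermia, Lysander, and Demetrius.",
    "similarChunks": [
        {"chunkUid": "6fe30865-731f-4524-958b-5f12a6ab53a4",
         "similarityScore": 0.5944186,
         "sourceFile": "midsummer-nights-dream.pdf"},
        {"chunkUid": "838cf477-290c-4484-8f17-78be26bcd946",
         "similarityScore": 0.56566054,
         "sourceFile": "midsummer-nights-dream.pdf"},
        {"chunkUid": "d5f623aa-0985-4b48-aee5-a8f90e16a1f6",
         "similarityScore": 0.5506241,
         "sourceFile": "midsummer-nights-dream.pdf"},
    ],
}


def chunks_above(response, min_score):
    """Return (score, sourceFile, chunkUid) tuples with score >= min_score, best first."""
    hits = [
        (chunk["similarityScore"], chunk["sourceFile"], chunk["chunkUid"])
        for chunk in response.get("similarChunks", [])
        if chunk["similarityScore"] >= min_score
    ]
    return sorted(hits, reverse=True)


print(ask_response["answer"])
for score, source_file, chunk_uid in chunks_above(ask_response, min_score=0.56):
    print(f"{score:.4f}  {source_file}  {chunk_uid}")
```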
#Next Steps
- Learn how Artifact processes your files into a unified AI-ready format
- Obtain your parsed Markdown and processed chunks using the Get Single Source-of-Truth and View Chunks APIs
- Experiment with the Retrieve Chunks API to find semantically similar text chunks in your Catalog
- Build custom AI Pipelines with your catalog by connecting to it via the Artifact Data Component
#See Our Examples
Explore, test, modify and draw inspiration from the diverse range of AI products you can build with our services on our Examples page. This includes:
- Pipelines that are API-ready for external integrations
- Servable models that are ready to be deployed on Model
- Tutorials that give you step-by-step guidance on how to build your own AI applications
- Instill AI Cookbooks that demonstrate how to solve real-world problems with our Python SDK
#Read Our Blog
Stay up-to-date with our latest product updates, AI insights, and tutorials by visiting our Blog.
#Support
Please see our Support page for more information on how to get help.