Pipeline defines several Instill Types as data type identifiers. These
types simplify the creation of pipelines by eliminating the complexity of
converting unstructured data formats. The supported Instill Types include
primitive data types such as `boolean`, `string`, `integer`, `number`, and
`json`, as well as unstructured data types like `file`, `document`, `image`,
`audio`, and `video`, along with array data types.
These Instill Types enable users to efficiently build pipelines that manage unstructured data in ETL workflows.
#Instill Types
Instill Types extend JSON primitive types and MIME types (IANA media types).
#Primitive Data Types
- `string`
- `number`
- `integer`
- (Coming soon) `boolean`
- `json`
#Unstructured Data Types
MIME types are defined as `<type>/<subtype>`, and Pipeline extends this to
categorize unstructured data into five types:
- `image`: e.g., `image/jpeg`
- `video`: e.g., `video/h264`
- `audio`: e.g., `audio/wav`
- `document`: e.g., `text/html`, `application/pdf`
- `file`: e.g., `application/octet-stream`
  - Please note that `file` is a generic type for any file, and the other types (`text`, `image`, `video`, `audio`, and `document`) are specialized file types with specific format handling capabilities.
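
As a minimal sketch, a recipe could declare one input variable per unstructured type, following the `variable` syntax used in the examples on this page. The variable names and titles below are hypothetical; only the `type` values come from the list above.

```yaml
# Hypothetical variable names and titles, for illustration only.
variable:
  photo:
    title: Photo
    type: image # e.g., image/jpeg
  clip:
    title: Clip
    type: video # e.g., video/h264
  recording:
    title: Recording
    type: audio # e.g., audio/wav
  report:
    title: Report
    type: document # e.g., text/html, application/pdf
  attachment:
    title: Attachment
    type: file # Generic type, e.g., application/octet-stream
```

Choosing the most specific type that fits the input, rather than the generic `file`, is what gives a variable access to the format handling described above.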
#Auto-Conversion
With Instill Type, users don't need to manually handle type conversion. For
instance, if a pipeline accepts a PNG image and a component requires JPEG, they
can be directly connected as long as they share the same type, `image`.
Example:
```yaml
variable:
  image:
    title: A PNG Image
    type: image # User uploads an image

component:
  ai-0:
    type: openai
    task: TASK_TEXT_GENERATION
    input:
      prompt: What is in the image?
      prompt-images:
        - ${variable.image} # Component requires input in image type
```
Auto-conversion not only works within the same type but also supports cross-type
conversions. For example, Pipeline can automatically convert a PDF document
into the `text/markdown` type.
Example:
```yaml
variable:
  pdf:
    title: A PDF Document
    type: document # User uploads a PDF file

component:
  openai-0:
    type: openai
    task: TASK_TEXT_GENERATION
    input:
      prompt: ${variable.pdf} # Component requires text input
```
Supported Cross-Type Conversions:
- `document` → `text`
This feature allows users to focus on building business models without worrying about data type conversions, resulting in cleaner, more efficient pipelines.
#Attribute Extraction
Beyond type representation, Instill Type provides attribute extraction capabilities. The supported attributes include:
Type | Attribute | Description |
---|---|---|
All Files | :file-size | Size of the file in bytes |
All Files | :filename | Name of the file |
All Files | :content-type | Content type of the file |
All Files | :data-uri | Data URI representation of the file |
All Files | :base64 | Base64 representation of the file |
Image | :width | Width of the image |
Image | :height | Height of the image |
Video | :duration | Duration of the video |
Video | :width | Width of the video |
Video | :height | Height of the video |
Video | :frame-rate | Frame rate of the video |
Audio | :duration | Duration of the audio |
Audio | :sample-rate | Sample rate of the audio |
Example: Accessing the width and height of an image:
```yaml
variable:
  image:
    title: A PNG Image
    type: image # User uploads an image in PNG type

output:
  bounding-boxes:
    title: Bounding Boxes
    value: ${ai-0.output.bounding-boxes}
  height:
    title: Image Height
    value: ${variable.image:height} # Get the image height
  width:
    title: Image Width
    value: ${variable.image:width} # Get the image width

component:
  ai-0:
    type: instill-model
    task: TASK_DETECTION
    input:
      image: ${variable.image}
```
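
The same `${variable.<name>:<attribute>}` pattern should presumably apply to the other attributes in the table. The following is a sketch under that assumption, using a video input; the variable name, output names, and titles are hypothetical, and the attribute names are taken from the table above.

```yaml
# Hypothetical recipe fragment: names and titles are illustrative only.
variable:
  clip:
    title: A Video Clip
    type: video # User uploads a video

output:
  duration:
    title: Video Duration
    value: ${variable.clip:duration} # Video attribute
  frame-rate:
    title: Video Frame Rate
    value: ${variable.clip:frame-rate} # Video attribute
  content-type:
    title: Content Type
    value: ${variable.clip:content-type} # Available for all files
  file-size:
    title: File Size in Bytes
    value: ${variable.clip:file-size} # Available for all files
```

Because the "All Files" attributes are listed for every file type, they can be read alongside the type-specific ones on the same variable.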