QuickStart with HTML files
You can get started quickly with HTML files that you have downloaded from a website.
This is particularly useful if you want to download a website's contents and provide a natural language interface on top of them.
Installing via Python
You can install Twilix's BlitzChain package for Python by running the following pip command:
pip install -U blitzchain
Once the package is installed, you need to create a client with your API key and open a collection to insert data into. You can get your API key from app.twilix.io.
API_KEY = "YOUR_API_KEY"
from blitzchain import Client
client = Client(API_KEY)
collection = client.Collection("htmlExample")
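To avoid hardcoding the key in your script, one option is to read it from an environment variable. The sketch below assumes a variable named `TWILIX_API_KEY`; that name and the `load_api_key` helper are illustrative choices, not a BlitzChain convention.

```python
import os


def load_api_key(env_var: str = "TWILIX_API_KEY") -> str:
    """Return the API key stored in an environment variable.

    The default variable name is an assumption made for this example.
    """
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable to your API key.")
    return api_key
```

You could then create the client with `Client(load_api_key())` instead of passing a literal string.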
Processing
You can then insert a locally saved HTML file using just a few lines of code.
html_file = 'example.html'
with open(html_file) as f:
    html_content = f.read()
Inserting Data
Twilix supports inserting complex data types such as HTML. We handle the parsing, splitting, and indexing for you.
# This runs in the background on our servers, where we handle splitting and parsing for you
# The metadata is flattened and stored alongside the document
# If you provide only the HTML, we automatically extract the title for you
collection.insert_html(html_content, metadata={"url": "https://example.com/index/", "insert_date": "22-04-21"})
# Titles are used to provide clean reference titles
# You can insert a title using the code below
collection.insert_html(
    html_content,
    metadata={"url": "https://example.com/index/", "insert_date": "22-04-21"},
    title="Sample Index HTML"
)
The call returns a response similar to:
{'success': True, 'results': []}
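If you would like to preview which title a page declares before deciding whether to pass `title` yourself, a local check along these lines may help. It is only an illustration built on Python's standard `html.parser` module; it is not how Twilix extracts titles on its servers, and the `extract_title` helper is a hypothetical name.

```python
from html.parser import HTMLParser
from typing import Optional


class _TitleParser(HTMLParser):
    """Collect the text of the first <title> element."""

    def __init__(self) -> None:
        super().__init__()
        self._in_title = False
        self.done = False
        self.title_parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self.done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self.done = True  # Ignore any later <title> elements.

    def handle_data(self, data):
        if self._in_title:
            self.title_parts.append(data)


def extract_title(html: str) -> Optional[str]:
    """Return the page title with whitespace collapsed, or None if there is none."""
    parser = _TitleParser()
    parser.feed(html)
    parser.close()
    title = " ".join("".join(parser.title_parts).split())
    return title or None
```

When `extract_title(html_content)` returns `None` or an unhelpful value, passing an explicit `title` to `insert_html` is probably the better choice.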
You can then check the size of your collection to confirm that the document was inserted.
collection.count()
This will return something similar to:
{'count': 135}
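The two responses shown above can be combined into a simple sanity check. This is a minimal sketch that assumes the responses are plain dictionaries with the `success` and `count` keys shown in this tutorial; the `insertion_succeeded` helper is hypothetical.

```python
def insertion_succeeded(insert_response: dict, count_response: dict) -> bool:
    """Return True if the insert reported success and the collection is non-empty."""
    inserted = insert_response.get("success") is True
    has_documents = int(count_response.get("count", 0)) > 0
    return inserted and has_documents
```

Because processing runs in the background, the count may take a moment to reflect a new insert, so a `False` result right after inserting is not necessarily an error.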
Launching dashboard
You can then launch the dashboard using the following code:
collection.launch_dashboard(
    name="Example Website",
    description="Source: https://example.com/index/"
)