Initial Database Connection

Chata.ai teams train AutoQL language models by connecting to a database and using its structure to generate database-specific corpora. Because models are trained on the database schema rather than on the data itself, customers can share their database structure with Chata.ai in one of three ways:

Option #1 - Whitelist a development environment

This is the preferred way of creating a model because the resulting models can be tested against data the customer already knows. An added advantage is that later changes to the database structure can easily be incorporated into the language model.

In this case, Chata.ai supplies IP addresses for the customer to allow through their firewall and route to the database. The customer also needs to create a read-only database user with access to the Information Schema.
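
To illustrate what schema-only access exposes, here is a minimal sketch that reads table and column metadata without touching any rows. It is not Chata.ai's tooling. It uses Python's built-in sqlite3 module as a stand-in, and since SQLite has no Information Schema, it reads the equivalent metadata from sqlite_master and PRAGMA table_info. The function name read_schema and the customers and invoices tables are hypothetical.

```python
import sqlite3


def read_schema(conn):
    """Return {table: [(column, declared_type), ...]} without reading any rows."""
    schema = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk)
        quoted = table.replace('"', '""')
        cols = conn.execute(f'PRAGMA table_info("{quoted}")').fetchall()
        schema[table] = [(col[1], col[2]) for col in cols]
    return schema


# Hypothetical tables standing in for a customer's development database
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE invoices (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)

schema = read_schema(conn)
for table, columns in schema.items():
    print(table, columns)
```

On a database server that does provide an Information Schema, the same metadata would typically come from views such as information_schema.columns, queried through the read-only user.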

Option #2 - Database backup

A database backup can be sent to Chata.ai and restored in Chata.ai’s cloud environment. This option can also produce a high-quality model because the database schema is fully available.

Option #3 - Data dump

In some instances, a dump of the data as CSV files can be sent to Chata.ai. While this is sufficient for proofs of concept, it is less suited to creating the best models because it matches a real database connection less closely.
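
One likely reason for the gap (an assumption, not something Chata.ai states) is that a CSV file carries column names and raw text values but no declared types, keys, or relationships. The sketch below shows this with Python's standard csv module. The helper columns_from_csv and the sample data are hypothetical.

```python
import csv
import io


def columns_from_csv(text):
    """Return the header row of a CSV document as a list of column names."""
    reader = csv.reader(io.StringIO(text))
    return next(reader, [])


# Hypothetical sample standing in for one exported table
sample = "id,customer_id,total\n1,42,19.99\n"

columns = columns_from_csv(sample)
rows = list(csv.DictReader(io.StringIO(sample)))

print(columns)
# Every value is read back as a string; the original column types are not recorded
print({name: type(value).__name__ for name, value in rows[0].items()})
```

Any types or relationships would have to be inferred or supplied separately, which a live connection or a restored backup provides directly.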

After the data or schema is received, the Chata.ai integration team uses a proprietary system to generate the language model and makes it available to the customer for testing and validation. Once the language model has been thoroughly tested, it is published to the customer’s cloud storage, where it can be assigned to a database connection created in the Enterprise Portal.