The POST /v1/datasets/{datasetId}/datasources and POST /v1/datasources endpoints allow you to upload your data in one of two ways:

  • From a public URL: Use the url parameter.

  • From your local storage: Use the fileKey parameter.

This guide focuses on the latter—uploading a local file to create a data source.


Authentication

All requests to Powerdrill must include an x-pd-api-key header with your API key.

To get your API key, see Quick Start.
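
If you work from a shell, one optional convenience is to keep the key in an environment variable and pass it in the x-pd-api-key header on each call. Here is a minimal Bash sketch; the variable name PD_API_KEY is just an example, not something the API requires:

# Store your API key once (replace <api-key> with your actual key).
export PD_API_KEY="<api-key>"

# Then pass it in the x-pd-api-key header on any Powerdrill request, for example:
curl --header "x-pd-api-key: $PD_API_KEY" ...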


Step 1. Upload your local file and obtain the fileKey

Use the POST /v1/file/upload_datasource endpoint to upload your local file. After the upload, the response will include a fileKey. Save this fileKey to create a data source using the Create data source endpoint.

Supported file formats include: .csv, .tsv, .md, .mdx, .json, .txt, .pdf, .pptx, .ppt, .doc, .docx, .xls, and .xlsx.

Here’s an example in cURL:

Example request:
curl --location 'https://ai.data.cloud/api/v1/file/upload_datasource' \
--header 'Content-Type: multipart/form-data' \
--header 'x-pd-api-key: <api-key>' \
--header 'x-pd-api-agent-id: GENERAL' \
--form 'file=@"<file_path>"'

When making a request:

  • Replace <api-key> with your actual API key.

  • Replace <file_path> with the actual full path to your file, for example, /Users/test/workspace/sales_2024.csv.

Here’s an example response:

Example response:
{
  "code": 0,
  "data": {
    "fileKey": "/tmp/sdgsagdsgsadgasdg"
  }
}

Extract the fileKey value from the response for later use. In this example, it is /tmp/sdgsagdsgsadgasdg.
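
If you script the upload, you can capture the fileKey directly instead of copying it by hand. The sketch below is a minimal example assuming a Bash shell with curl and jq installed; replace <api-key> and the file path with your own values:

# Upload the local file and capture the returned fileKey (Bash + jq assumed).
# curl sets the multipart Content-Type header (including the boundary) automatically.
FILE_KEY=$(curl --silent --location 'https://ai.data.cloud/api/v1/file/upload_datasource' \
  --header 'x-pd-api-key: <api-key>' \
  --header 'x-pd-api-agent-id: GENERAL' \
  --form 'file=@"/Users/test/workspace/sales_2024.csv"' \
  | jq -r '.data.fileKey')

echo "$FILE_KEY"   # e.g. /tmp/sdgsagdsgsadgasdg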


Step 2. Create a data source

Now, create a data source so that the uploaded file is embedded and synchronized with your AI Workspace.

You can do this by sending a request to either the POST /v1/datasets/{datasetId}/datasources endpoint or the POST /v1/datasources endpoint.

The example below uses the POST /v1/datasets/{datasetId}/datasources endpoint.

Example request:
curl --location 'https://ai.data.cloud/api/v1/datasets/cm3my37en3q36017q7x3hyyf4/datasources' \
  --header 'x-pd-api-key: <api-key>' \
  --header 'Content-Type: application/json' \
  --data '{
  "name": "test.csv",
  "fileName": "test.csv",
  "type": "FILE",
  "fileKey": "/tmp/sdgsagdsgsadgasdg"
}'

When making a request:

  • Replace <api-key> with your actual API key.
  • Replace the dataset ID in the URL (cm3my37en3q36017q7x3hyyf4 in this example) with the ID of the dataset you want to add the data source to.
  • Set fileKey to the value returned in Step 1.

Example response:
{
  "code": 0,
  "data": {
    "id": "datasource-cadsgfsdagasgadsg",
    "datasetId": "dataset-dagasdgasgasg",
    "name": "test.csv",
    "fileName": "test.csv",
    "type": "FILE",
    "status": "pending"
  }
}

Obtain the id (data source ID) and datasetId (dataset ID) for later use. In this example, the data source ID is datasource-cadsgfsdagasgadsg and the dataset ID is dataset-dagasdgasgasg.
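
If you are scripting the flow, you can capture both IDs the same way as the fileKey. The following is a minimal Bash sketch, again assuming curl and jq; <api-key> and <dataset-id> are placeholders for your own values, and FILE_KEY is the variable set in the Step 1 sketch:

# Create the data source and capture the response (Bash + jq assumed).
RESPONSE=$(curl --silent --location 'https://ai.data.cloud/api/v1/datasets/<dataset-id>/datasources' \
  --header 'x-pd-api-key: <api-key>' \
  --header 'Content-Type: application/json' \
  --data "{
    \"name\": \"test.csv\",
    \"fileName\": \"test.csv\",
    \"type\": \"FILE\",
    \"fileKey\": \"$FILE_KEY\"
  }")

# Extract the data source ID and dataset ID for the next step.
DATASOURCE_ID=$(echo "$RESPONSE" | jq -r '.data.id')
DATASET_ID=$(echo "$RESPONSE" | jq -r '.data.datasetId')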


Step 3. Check data source status

The data source must be in the synched state to be ready for use. To check its status, use the GET /v1/datasets/{datasetId}/datasources/{datasourceId} endpoint.

Here’s an example:

Example request:
curl --request GET \
  --url https://ai.data.cloud/api/v1/datasets/dataset-dagasdgasgasg/datasources/datasource-cadsgfsdagasgadsg \
  --header 'x-pd-api-key: <api-key>'

Check the status in the response:

Example response:
{
  "code": 0,
  "data": {
    "id": "datasource-cadsgfsdagasgadsg",
    "datasetId": "dataset-dagasdgasgasg",
    "name": "test.csv",
    "fileName": "test.csv",
    "type": "FILE",
    "status": "synched"
  }
}

In this example, the data source is in the synched state and is ready for use in data analysis jobs.

Other possible statuses include:

  • pending: Waiting to be processed.
  • running: Currently being processed.
  • error: Processing failed.

If the status is pending or running, wait for some time and check again until it changes to synched.

If the status is error, you’ll need to re-upload the file by starting from Step 1.
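
The polling can also be automated. Here is a minimal Bash sketch, assuming curl and jq and reusing the DATASET_ID and DATASOURCE_ID variables from the Step 2 sketch (replace <api-key> with your own key); it checks the status every few seconds until it is synched and stops on error:

# Poll the data source status until it is synched (Bash + jq assumed).
while true; do
  STATUS=$(curl --silent --request GET \
    --url "https://ai.data.cloud/api/v1/datasets/$DATASET_ID/datasources/$DATASOURCE_ID" \
    --header 'x-pd-api-key: <api-key>' \
    | jq -r '.data.status')
  echo "Current status: $STATUS"
  case "$STATUS" in
    synched) break ;;                                                  # ready for data analysis jobs
    error)   echo "Processing failed; re-upload the file (Step 1)." >&2; break ;;
    *)       sleep 5 ;;                                                # pending or running: wait and poll again
  esac
done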