Creating & Downloading Documents in Bulk
This guide covers two ways of creating documents in bulk:
- Using the Create Multiple Documents API
- Using the Vault Loader Command Line Interface (CLI).
We’ll also look at how to download documents in bulk.
Creating multiple documents is a two-step process.
- First, upload the source files to your Vault’s File Staging with Vault File Manager, Loader CLI, or the Vault API’s file staging endpoints.
- Then, use the Create Multiple Documents API or Load Data Objects endpoint to set the data attributes (field values) of the documents. These API calls also copy files from File Staging to your Vault.
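As a rough illustration of the second step, the following Python sketch builds (but does not send) a request to the Create Multiple Documents API. It assumes the endpoint URL and headers used in this guide's curl example. The function name build_create_documents_request and the placeholder session ID are hypothetical.

```python
import urllib.request

# Endpoint copied from this guide's curl example; host and API version are placeholders.
BATCH_URL = "https://myvault.veevavault.com/api/v15.0/objects/documents/batch"


def build_create_documents_request(csv_bytes, session_id, url=BATCH_URL):
    """Build, without sending, the POST request that carries the CSV input."""
    return urllib.request.Request(
        url,
        data=csv_bytes,
        method="POST",
        headers={
            "Authorization": session_id,  # session ID obtained elsewhere (assumption)
            "Content-Type": "text/csv",
            "Accept": "text/csv",
        },
    )


if __name__ == "__main__":
    body = b"file,name__v,type__v,lifecycle__v,suppressRendition\n"
    request = build_create_documents_request(body, "SESSION_ID_PLACEHOLDER")
    print(request.get_method(), request.full_url)
    # To actually send it: urllib.request.urlopen(request).read()
```

Building the request separately from sending it makes it easy to inspect the headers and body before anything reaches your Vault.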
Vault Loader CLI
You can download Vault Loader CLI from the Vault UI. Learn more about Vault Loader CLI in Vault Help.
Create Documents
In the API call, you must provide values for all required attributes, for each document that you’re creating. Learn more about retrieving the required attributes in Understanding Metadata.
The request below creates three documents from source files doc1.txt, doc2.txt, and doc3.txt. The required metadata attributes for these documents are name__v, type__v and lifecycle__v.
The file column and suppressRendition are not required:
- The file attribute tells Vault where to look for the source file in File Staging. Omitting this column or leaving the value blank creates document placeholders.
- The suppressRendition attribute tells Vault to delay generating a viewable rendition for the document until a user views the document from the UI. This setting reduces the load on your Vault by preventing unnecessary processing.
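An input file with these columns can also be generated programmatically. The following Python sketch is one possible approach, not part of Vault Loader: it checks that each row has the three required attributes from this example, then writes the CSV text. The helper name build_input_csv is hypothetical.

```python
import csv
import io

# Required attributes for the documents in this guide's example.
REQUIRED = ("name__v", "type__v", "lifecycle__v")
COLUMNS = ("file",) + REQUIRED + ("suppressRendition",)


def build_input_csv(rows):
    """Return CSV text for the rows, rejecting any row that lacks a required value."""
    for number, row in enumerate(rows, start=1):
        missing = [name for name in REQUIRED if not row.get(name)]
        if missing:
            raise ValueError(f"row {number} is missing {', '.join(missing)}")
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        # Optional columns default to blank; a blank file value yields a placeholder.
        writer.writerow({name: row.get(name, "") for name in COLUMNS})
    return buffer.getvalue()


if __name__ == "__main__":
    print(build_input_csv([
        {"file": "doc1.txt", "name__v": "doc1", "type__v": "type",
         "lifecycle__v": "eTMF Lifecycle", "suppressRendition": "true"},
    ]))
```

Validating before upload catches missing required values locally instead of in the per-row errors column of the response.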
CSV Input
Both of the requests below use the create_documents.csv input file.
file,name__v,type__v,lifecycle__v,suppressRendition
doc1.txt,doc1,type,eTMF Lifecycle,true
doc2.txt,doc2,type,eTMF Lifecycle,true
doc3.txt,doc3,type,eTMF Lifecycle,true
Request
Using curl:
curl -X POST -H "Authorization: {SESSION_ID}" \
-H "Content-Type: text/csv" \
-H "Accept: text/csv" \
--data-binary @"C:\Vault\Documents\create_documents.csv" \
https://myvault.veevavault.com/api/v15.0/objects/documents/batch
Using the command line interface:
java -jar VaultDataLoader.jar -createdocuments -csv create_documents.csv
Response
You can see from the response that all documents were successfully created. The id column returns the system ID generated for each document and row_id maps each row back to the original input.
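The row_id mapping can be sketched in Python as follows. This is an illustration only: it assumes row_id is the 1-based position of the data row in the input file (consistent with this guide's example) and that rows with any status other than SUCCESS carry a message in the errors column. The function name summarize_response is hypothetical.

```python
import csv
import io


def summarize_response(response_csv, input_rows):
    """Pair each response row with its input row via row_id (assumed 1-based)."""
    created, failed = {}, {}
    for row in csv.DictReader(io.StringIO(response_csv)):
        source = input_rows[int(row["row_id"]) - 1]
        if row["responseStatus"] == "SUCCESS":
            created[source["name__v"]] = int(row["id"])
        else:
            failed[source["name__v"]] = row["errors"]
    return created, failed


if __name__ == "__main__":
    response = (
        "responseStatus,id,name__v,type__v,lifecycle__v,errors,row_id\n"
        "SUCCESS,1634,,,,,1\n"
        "SUCCESS,1635,,,,,2\n"
        "SUCCESS,1636,,,,,3\n"
    )
    inputs = [{"name__v": "doc1"}, {"name__v": "doc2"}, {"name__v": "doc3"}]
    print(summarize_response(response, inputs))
```

Because the response leaves name__v blank, looking the name up through row_id is what ties each new document ID back to the row that produced it.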
responseStatus,id,name__v,type__v,lifecycle__v,errors,row_id
SUCCESS,1634,,,,,1
SUCCESS,1635,,,,,2
SUCCESS,1636,,,,,3
CLI Async Mode
You can run the CLI asynchronously and free up the command prompt. The CLI gives you a job ID that you can use to track the job status. When the job is complete, you can use -jobresults to download the results, which are similar to the CSV above. In async mode, these don’t download automatically.
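If you script the CLI, the job ID has to be captured from the submission output before you can check status or fetch results. The Python sketch below assumes the submission message has the form "Job 10479 submitted." and only builds the follow-up command lines with the -jobstatus and -jobresults flags described here. It does not run them, and the helper names are hypothetical.

```python
import re

# Base command as used throughout this guide.
LOADER = ["java", "-jar", "VaultDataLoader.jar"]


def parse_job_id(submit_output):
    """Extract the job ID from output such as 'Job 10479 submitted.'."""
    match = re.search(r"Job (\d+) submitted", submit_output)
    if match is None:
        raise ValueError("no job ID found in CLI output")
    return match.group(1)


def follow_up_commands(job_id):
    """Return the argument lists for checking status and downloading results."""
    return {
        "status": LOADER + ["-jobstatus", job_id],
        "results": LOADER + ["-jobresults", job_id],
    }


if __name__ == "__main__":
    job_id = parse_job_id("Job 10479 submitted.")
    for name, argv in follow_up_commands(job_id).items():
        print(name, " ".join(argv))
```

The argument lists can be passed to subprocess.run once you are ready to execute them.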
Example: CLI Async
These examples show the same document creation process as above, but using the CLI in asynchronous mode:
>> java -jar VaultDataLoader.jar -createdocuments -csv createdoc.csv -async
Job 10479 submitted.
>> java -jar VaultDataLoader.jar -jobstatus 10479
ID Type Status Progress Start Time End Time Expiration Date
10479 create-documents Success 3 / 0 2016-11-28 2016-11-28 2016-12-14
02:23:49AM GMT 02:23:53AM GMT 02:23:53AM GMT
>> java -jar VaultDataLoader.jar -jobresults 10479
Downloading Documents
Vault Loader (UI or CLI) provides the most efficient way to download source files and renditions. This section describes how to accomplish this from the CLI.
Download Location
Vault Loader downloads source files and renditions to File Staging, not your local machine. Once the process is complete, you can download the files from File Staging with Vault File Manager, Loader CLI, or the Vault API. The results indicate where the files reside on the server and which versions were included in the extract.
Vault uses the following directory structure for downloaded files:
{job_id}/{doc_id}/{major_version_number__v}/{minor_version_number__v}/
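Because the job results list where each file resides, a script can read the file paths from the results rather than constructing them. The Python sketch below assumes results in the column layout shown in this guide's download example (file, id, external_id__v, rendition_type__v, major_version_number__v, minor_version_number__v). The function name staged_files is hypothetical.

```python
import csv
import io


def staged_files(results_csv):
    """Map each document ID to a list of (File Staging path, version) pairs."""
    files = {}
    for row in csv.DictReader(io.StringIO(results_csv)):
        version = f'{row["major_version_number__v"]}.{row["minor_version_number__v"]}'
        files.setdefault(int(row["id"]), []).append((row["file"], version))
    return files


if __name__ == "__main__":
    results = (
        "file,id,external_id__v,rendition_type__v,"
        "major_version_number__v,minor_version_number__v\n"
        "/10381/1636/0_1/doc3.txt,1636,,,0,1\n"
        "/10381/1634/0_1/doc1.txt,1634,,,0,1\n"
    )
    for doc_id, entries in staged_files(results).items():
        print(doc_id, entries)
```

Taking paths from the file column avoids hard-coding any assumption about the directory layout.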
Download Limits
Vault File Staging automatically removes downloaded files after the job’s Expiration Date.
Vault Loader cannot extract more than 2,000 files (source files and renditions) in a single job.
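One way to stay within the 2,000-file limit is to split a large set of document IDs across several export jobs. The Python sketch below is an illustration under stated assumptions: files_per_doc is a hypothetical caller-supplied upper bound on the files (source plus renditions) each document contributes, and the where clause follows the "id contains (...)" form used in this guide's CLI example. Any limit on clause length is not considered.

```python
MAX_FILES_PER_JOB = 2000  # limit stated in this guide


def where_clauses(doc_ids, files_per_doc=1):
    """Yield one where clause per export job so each job stays within the limit."""
    if files_per_doc < 1 or files_per_doc > MAX_FILES_PER_JOB:
        raise ValueError("files_per_doc must be between 1 and the per-job limit")
    docs_per_job = MAX_FILES_PER_JOB // files_per_doc
    for start in range(0, len(doc_ids), docs_per_job):
        batch = doc_ids[start:start + docs_per_job]
        yield "id contains (" + ", ".join(str(doc_id) for doc_id in batch) + ")"


if __name__ == "__main__":
    for clause in where_clauses([1634, 1635, 1636]):
        print(clause)
```

Each yielded clause can be passed to a separate export job through the -where option.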
Example: Downloading with CLI
>> java -jar VaultDataLoader.jar -exportdocument -source -where "id contains (1634, 1635, 1636)" -async
>> java -jar VaultDataLoader.jar -jobstatus 10381
ID Type Status Progress Start Time End Time Expiration Date
10381 export-documents Success 3 / 0 2016-11-28 2016-11-28 2016-12-14
04:24:25AM GMT 04:24:25AM GMT 04:24:25AM GMT
>> java -jar VaultDataLoader.jar -jobresults 10381
file,id,external_id__v,rendition_type__v,major_version_number__v,minor_version_number__v
/10381/1636/0_1/doc3.txt,1636,,,0,1
/10381/1634/0_1/doc1.txt,1634,,,0,1
/10381/1635/0_1/doc2.txt,1635,,,0,1
Next Steps
Now that you know how to create and download documents, you can learn how to query document metadata attributes using VQL.