Integrating with Force.com Platform using DbAmp: Part-03: Insert

Typically, when data is uploaded to Salesforce, it will be usually an upsert operation, but it is not uncommon to use insert operation. For e.g. if you are importing a fresh set of data or importing data for new object(s), there is no need to do an upsert and insert is a perfect candidate for these kind of scenarios.
The insert operation using the DbAmp can be done either directly against a Salesforce org or can be done locally in a staged database and then can be inserted into Salesforce org. The main advantage for using the staged database is that while the direct access method is very slow, especially for bulk uploads, the local access method (the staged database) is much faster. Also, the direct access method affects the governor limits at a faster rate when compared to local access method, since in the direct access method, each record inserted count against the API limit, while in the local access method, even without using BULKAPI switch, it takes 200 record inserts to count against the API limit. We will cover both the scenarios here and demonstrate them with couple of examples.
Direct Access Method:
It’s very simple to use the direct access method. The following statement inserts new Contact record in the Contact object:
INSERT INTO sforce contact ( LastName) VALUES (‘TestLastName’ )
As you may guess, the auto-populate system fields such as the Id, CreatedBy, CreatedDateTime, LastModifiedBy, LastModifiedDateTime fields should be left blank. These fields will be auto populated by the Salesforce itself.
Staged (Local) Database Method:
In the staged database method, a database is created locally in the SQL server and the data is inserted using a temporary table called load table. Here is the brief steps on accomplishing this:
  • Create a database in the SQL server (one-time task)
  • Replicate / Refresh the table that you want to insert the record(s)
  • Create the load table. The load table is a physical table, but temporary in nature. This load table should include the following fields
    • Id, Error and any fields that are necessary or that you want to populate.
      • The ‘Id’ field should be left empty and should not be populated. When insert succeeds, this field is populated with the ‘Id’ returned by the Salesforce.
      • The ‘Error’ field is populated by DbAmp with the status of the insert operation.
  • Populate the load table
  • Insert the data from the table into Salesforce using the SF_BULKOPS stored procedure.
Note that the SF_BULKOPS is one of the several stored procedure that ships with the DbAmp. We will cover the options, nitty-gritty’s of this stored procedure in a separate blog post.
Now, let’s look at an example of doing insert with the staged database method. The following script populates a list of contacts in the load table and then does the SF_BULKOPS to insert those contacts into the Salesforce.
EXEC SF_REFRESH ‘SALESFORCE’, [Contact]
CREATE TABLE [dbo]. [Contact_Load]
  (
    [Id] [nvarchar] ( 100 ) NULL,
    [Error] [nvarchar] (4000 ) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
    LastName [nvarchar] ( 100 ) NULL
   )
INSERT INTO Contact_Load ( LastName) VALUES ( ‘Smith’ );
INSERT INTO Contact_Load ( LastName) VALUES ( ‘Adams’ );
INSERT INTO Contact_Load ( LastName) VALUES ( ‘Paul’ );
 
EXEC dbo . SF_BulkOps @operation = ‘Insert’, @table_server = ‘SFORCE’ , @table_name = ‘Contact_Load’
Let’s analyze what this script does.
  • The first line ‘EXEC SF_REFRESH ‘SFORCE’, [Contact]‘ refreshes the contact table from the Contact object in the Salesforce. The ‘SFORCE’ is the linked server name and the ‘[Contact]’ is the Salesforce Contact object. The reason that we have to refresh always before doing any CRUD operation is because, there could be many changes made between the last time the Contact object was refreshed and now. The SF_REFRESH brings the delta changes from the Salesforce Contact object to the local contact table in the SQL server.
  • The ‘CREATE TABLE [dbo]. [Contact_Load]…‘ creates the load table. As described previously, this table SHOULD have at least two fields which is ‘Id’ and the ‘Error’.
  • The ‘INSERT INTO Contact_Load…‘ statements inserts the records into these table. In the ideal world, the load table will be populated from a SELECT query that you will use to select the records from your tables.
  • The line ‘EXEC dbo . SF_BulkOps‘ is what actually inserts the data into Salesforce.
When you execute the script and the execution completes successfully, the contacts would have been inserted into the Salesforce Contact object. The ‘Id’ field will be populated with the Id for these new records and the ‘Error’ field will be populated with the status message. The older versions of the DbAmp used to leave the field empty, but somewhere in the documentation, I read the latest version of  DbAmp populates this field for even successful operations. Note that if you accidentally populate the Id field in the Contact_Load table with the existing record, the SF_BULKOPS will not detect that this is an existing record and will always insert this as a new record in the Contact object. In other words, if you have any records in the Contact_Load table, the SF_BULKOPS procedure will treat it as new records to be inserted and will do so.
Using Bulk API for INSERT
Salesforce.com provides two different types of APIs for sending (Insert, Upsert, Update, Delete) data to Salesforce.com. All the commercial tools such as Dell Boomi, Snaplogic, Cast Iron, Jitterbit, DbAmp, Pragmatic Works Task Factory, etc… and the open source tools such as Mulesoft, dataloader.io and even the Salesforce.com’s Data Loader use the same API to send the data to Salesforce.com. There is no other back door to deal with the Salesforce.com. The first of them which is the Web Services API is what highly used in most of the scenarios. The Web Services API is a synchronous API and can sends records in a batch of 200. That means if you 1000 records, then it makes 5 API calls to send all the 1000 records. The web services API should cater to the most integration needs because in a typical integration scenario the data moved is usually in few hundreds to thousands. But there are situations where bulk data movement is necessary which may run in few thousands to few hundred thousands of records. For such scenarios, Salesforce has a different set of API called the Bulk API. The Bulk API can support upto 10,000 records in one single API call. It is also asynchronous in nature which means that when a bulk API is used, the call is queued and executed whenever the system has the resources, which is usually, quite sooner. It is the responsibility of the caller to query and get the status of the queued call. So, if you need to send 100,000 records, with bulk API, it would take only 10 API calls, while the web services API would take 500 API calls.
DbAmp supports both the APIs and uses the web service API by default. If you need to use the bulk API, then it can be specified as an option. For e.g.
EXEC dbo . SF_BulkOps @operation = N’Insert:bulkapi’, @table_server = ‘SALESFORCE’ , @table_name = ‘Contact_Load’
 
One important thing to remember is that when you use the bulk API, the ‘Error’ column will be updated with a token and not with the status and you can use this token to track the status. To get the status, use the following command.
EXEC dbo . SF_BulkOps @operation = N’Status’, @table_server = ‘SALESFORCE’ , @table_name = ‘Contact_Load’
Conclusion
This article explored the basics of using DbAmp to insert data from local SQL database to Salesforce. There are much more things can be done with the insert operation and complete details can be found under the DbAmp documentation. The next article will focus on using the DbAmp to update the existing data in Salesforce from local SQL server.

Tagged: , , , ,

Leave a comment