The Tabscanner API enables you to upload an image of a receipt or an invoice and get back the result in JSON format.
The document formats supported for upload are jpg and png.
The standard results provided by Tabscanner will be very good on most receipts and invoices. For extreme levels of accuracy, however, custom configuration and training may be required. Please get in touch with our team to discuss your individual requirements.
Tabscanner uses a short polling system. First, the request is submitted via the process endpoint, where it is placed in a queue for processing and a token is returned.
This token is then used to short poll the result endpoint for the result. An image will normally take ~5 seconds to process, so we recommend waiting about that long before polling at one-second intervals.
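The polling flow described above could be sketched roughly as follows. This is a minimal illustration, not an official client: `poll_for_result`, its parameters and the `max_attempts` limit are hypothetical names chosen for this example, and `fetch_result` stands for any function that calls the result endpoint with a token and returns the parsed JSON (for instance the `callResult` function from the Python sample in this document). The sketch relies on the `status` attribute being `pending` until processing has finished.

```python
import time
from typing import Any, Callable, Dict


def poll_for_result(
    fetch_result: Callable[[str], Dict[str, Any]],
    token: str,
    initial_wait: float = 5.0,   # typical processing time quoted above
    interval: float = 1.0,       # recommended polling interval
    max_attempts: int = 30,      # hypothetical safety limit, not from the API
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Wait initial_wait seconds, then poll once per interval until the
    response is no longer pending."""
    sleep(initial_wait)
    for attempt in range(max_attempts):
        response = fetch_result(token)
        if response.get("status") != "pending":
            # Either the result is ready or the request failed.
            return response
        if attempt < max_attempts - 1:
            sleep(interval)
    raise TimeoutError(f"No result for token {token!r} after {max_attempts} polls")
```

Passing `sleep` as a parameter keeps the function easy to test without real delays. In production the default `time.sleep` is used.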
Your API key grants access to make calls on your account and must be kept secret at all times. Tabscanner must be called from a back-end application server, not directly from a client-side application. For this reason you won't find any API references for Swift/Android/Objective-C in our documentation.
All calls to Tabscanner are made over SSL; we do not provide a non-SSL version of the API. All data uploaded to Tabscanner is encrypted at rest and deleted after 90 days. The results from your calls to Tabscanner will be available for 90 days after the initial call. It is therefore important to manage the persistent storage of images and results in your own application.
https://api.tabscanner.com
Tabscanner authenticates via an API key passed as an HTTP header named apikey.
Your API key can be found by logging into your Tabscanner account and locating it under the API details section.
200 - OK | Request authenticated successfully |
400 - Error | ERROR_CLIENT: API key not found |
When an error occurs during a call to the API, Tabscanner returns a JSON object with details of the error. The code attribute identifies the error that occurred.
ATTRIBUTES
message | A message describing the error. |
status | The status for the request: success or failed. |
status_code | An integer. The status code of the request. |
success | A boolean. Whether the request was successful. |
code | An integer. The error code of the request. |
API RESPONSE CODES
200 - Process request submitted successfully
202 - Result available
300 - Image uploaded, but did not meet the recommended dimension of 720x1280 (WxH)
301 - Result not yet available
400 - API key not found
401 - Not enough credit
402 - Token not found
403 - No file detected
404 - Multiple files detected, can only upload 1 file per API call
405 - Unsupported mimetype
406 - Form parser error
407 - Unsupported file extension
408 - File system error
500 - OCR Failure
510 - Server error
520 - Database Connection Error
521 - Database Query Error
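The response codes listed above can be kept in a small lookup table so that the code attribute of a response is easier to interpret. The sketch below is illustrative only: the helper names `describe_code`, `is_error` and `should_keep_polling` are hypothetical, and treating every code of 400 or above as an error is an assumption drawn from the list, not a documented rule.

```python
# Descriptions copied from the API RESPONSE CODES list.
RESPONSE_CODES = {
    200: "Process request submitted successfully",
    202: "Result available",
    300: "Image uploaded, but did not meet the recommended dimension of 720x1280 (WxH)",
    301: "Result not yet available",
    400: "API key not found",
    401: "Not enough credit",
    402: "Token not found",
    403: "No file detected",
    404: "Multiple files detected, can only upload 1 file per API call",
    405: "Unsupported mimetype",
    406: "Form parser error",
    407: "Unsupported file extension",
    408: "File system error",
    500: "OCR Failure",
    510: "Server error",
    520: "Database Connection Error",
    521: "Database Query Error",
}


def describe_code(code: int) -> str:
    """Return the documented description, or a placeholder for unknown codes."""
    return RESPONSE_CODES.get(code, f"Unknown response code {code}")


def is_error(code: int) -> bool:
    """Assumption: codes of 400 and above indicate a failed request."""
    return code >= 400


def should_keep_polling(code: int) -> bool:
    """301 means the result is not yet available, so polling should continue."""
    return code == 301
```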
Tabscanner aims to be a very simple API and has only 2 endpoints for processing: process and result.
The version of the API to use is passed in the URL after /api/. The current version is 2. The result endpoint does not require a version number, as it is implied by the call to process.
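As a small illustration of the versioning rule above, the two endpoint URLs could be built like this. The function names are hypothetical and exist only for this sketch. The base URL and paths are the ones given in this document.

```python
from urllib.parse import quote

BASE_URL = "https://api.tabscanner.com"
API_VERSION = 2  # current version according to this document


def process_url(version: int = API_VERSION) -> str:
    """The process endpoint carries the API version after /api/."""
    return f"{BASE_URL}/api/{version}/process"


def result_url(token: str) -> str:
    """The result endpoint takes no version; the token goes in the path."""
    return f"{BASE_URL}/api/result/{quote(token, safe='')}"
```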
We have provided code samples for the following languages to help fast-track your integration:
As Tabscanner is not for use directly on phone apps we provide no documentation for Swift, Objective-C or Android.
https://api.tabscanner.com/api/2/process
https://api.tabscanner.com/api/result
The process endpoint allows the submission of an image.
It is a multipart/form-data POST request.
The request should contain the file you would like to upload, as well as the parameters for processing the image. All arguments are passed as form-data.
ARGUMENTS
file REQUIRED | The image file. Accepts JPG and PNG file formats. |
decimalPlaces optional | Accepts an integer value, which should be 0, 1 or 3. A hint for what to look for on the receipt. It can improve accuracy if you know the number of decimal places in advance. This is not related to number formatting. |
cents optional | Accepts a boolean value. Converts numbers without decimal places to cents. Only works with receipts set to 3 decimal places (e.g. 1.574 = 1.574, 245 = 0.245). |
documentType optional | Accepts a string value. Must be receipt, invoice or auto. The default is receipt. Specifies the type of document to be processed. If set to auto, Tabscanner will attempt to auto-detect the document type. |
defaultDateParsing optional | Accepts a string value. Must be m/d or d/m. For an ambiguous date, e.g. 02/03/2019, this parameter determines whether the date is read as month followed by day (m/d) or day followed by month (d/m). |
region optional | The ISO alpha-2 country code of the supported country. This takes into account number and date formats, language configurations and other settings to improve the accuracy of the results. Listed below are the ISO codes along with any custom fields that are available for the given region: Argentina - ar, Australia - au |
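The optional arguments above can be checked before a request is sent. The following is a rough sketch under stated assumptions: `build_process_params` is a hypothetical helper, the allowed values are copied from the ARGUMENTS table, and sending the cents boolean as the string "true" or "false" is a guess about how the form field is parsed, not documented behaviour. The returned dictionary is meant to be passed as the form data alongside the file.

```python
from typing import Dict, Optional

ALLOWED_DECIMAL_PLACES = {0, 1, 3}            # values listed for decimalPlaces
ALLOWED_DOCUMENT_TYPES = {"receipt", "invoice", "auto"}
ALLOWED_DATE_PARSING = {"m/d", "d/m"}


def build_process_params(
    document_type: Optional[str] = None,
    decimal_places: Optional[int] = None,
    cents: Optional[bool] = None,
    default_date_parsing: Optional[str] = None,
    region: Optional[str] = None,
) -> Dict[str, str]:
    """Validate the optional process arguments and return them as form fields."""
    params: Dict[str, str] = {}
    if document_type is not None:
        if document_type not in ALLOWED_DOCUMENT_TYPES:
            raise ValueError(f"documentType must be one of {sorted(ALLOWED_DOCUMENT_TYPES)}")
        params["documentType"] = document_type
    if decimal_places is not None:
        if decimal_places not in ALLOWED_DECIMAL_PLACES:
            raise ValueError(f"decimalPlaces must be one of {sorted(ALLOWED_DECIMAL_PLACES)}")
        params["decimalPlaces"] = str(decimal_places)
    if cents is not None:
        # Assumption: the boolean is sent as a lowercase string.
        params["cents"] = "true" if cents else "false"
    if default_date_parsing is not None:
        if default_date_parsing not in ALLOWED_DATE_PARSING:
            raise ValueError("defaultDateParsing must be 'm/d' or 'd/m'")
        params["defaultDateParsing"] = default_date_parsing
    if region is not None:
        if len(region) != 2 or not region.isalpha():
            raise ValueError("region must be a 2-letter ISO country code")
        params["region"] = region.lower()
    return params
```

For example, `build_process_params(document_type="auto", region="au")` returns `{"documentType": "auto", "region": "au"}`.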
ATTRIBUTES
token | A string. The token used to poll for the result. |
duplicate | A boolean describing if the same image has previously been uploaded. |
duplicateToken | A string that is the token of the first seen duplicate of the upload. |
message | A message describing the status of the request. |
status | The status for the request: success or failed. |
status_code | An integer. The status code of the request. |
success | A boolean. Whether the request was successful. |
code | An integer. The error code of the request. |
string url = "https://api.tabscanner.com/api/2/process";
string fileName = "path/to/imageFile/image.jpg";
string key = "yourapikey";
var client = new RestClient(url);
var request = new RestRequest(Method.POST);
request.AddFile("file", fileName, "image/jpeg");
request.AddHeader("apikey", key);
IRestResponse response = client.Execute(request);
var content = response.Content; // raw content as string
require 'rest_client'
require 'json'
API_KEY = 'yourapikey'
endpoint = 'https://api.tabscanner.com/api/2/process'
form = {
  # the form field name matches the documented file argument
  file: File.new('app/path/to/imageFile/image.jpg', 'rb')
}
headers = {apikey: API_KEY}
response = RestClient.post(endpoint,form,headers)
json = JSON.parse(response)
token = json["token"]
import requests
import json
from dotenv import load_dotenv
import os
# load your environment containing the api key
BASEDIR = os.path.abspath(os.path.dirname(__file__))
load_dotenv(os.path.join(BASEDIR, '.env'))
API_KEY = os.getenv("API_KEY")
def callProcess():
    endpoint = "https://api.tabscanner.com/api/2/process"
    receipt_image = "app/path/to/imageFile/image.jpg"
    payload = {"documentType": "receipt"}
    headers = {'apikey': API_KEY}
    # open the image in binary mode and close it once the upload completes
    with open(receipt_image, 'rb') as image_file:
        files = {'file': image_file}
        response = requests.post(endpoint,
                                 files=files,
                                 data=payload,
                                 headers=headers)
    result = json.loads(response.text)
    return result
$url = 'https://api.tabscanner.com/api/2/process';
$cFile = curl_file_create('app/path/to/imageFile/image.jpg');
// the form field name matches the documented file argument
$post = array('file' => $cFile);
$apikey = 'yourapikey';
$headers = array(
"apikey:" . $apikey
);
$cSession = curl_init();
curl_setopt($cSession, CURLOPT_URL, $url);
curl_setopt($cSession, CURLOPT_POST, 1);
curl_setopt($cSession, CURLOPT_POSTFIELDS, $post);
curl_setopt($cSession, CURLOPT_RETURNTRANSFER, true);
curl_setopt($cSession, CURLOPT_HEADER, false);
curl_setopt($cSession, CURLOPT_HTTPHEADER, $headers);
$result = curl_exec($cSession);
if (curl_errno($cSession)) {
$result = curl_error($cSession);
}
curl_close($cSession);
$json = json_decode( $result );
$token = $json->token;
String APIKEY = "yourapikey";
HttpResponse<JsonNode> jsonResponse = Unirest.post("https://api.tabscanner.com/api/2/process")
    .header("accept", "application/json")
    .header("apikey", APIKEY)
    .field("file", new File("path/to/imageFile/image.jpg"))
    .asJson();
'use strict';
// load your environment containing the secret API key
require('dotenv').config()
const API_KEY = process.env.API_KEY
const fs = require("fs");
const rp = require("request-promise");
async function callProcess(files, params) {
  let formData = {
    file: []
  }
  for (let i = 0; i < files.length; i++) {
    const file = files[i]
    formData.file.push({
      value: fs.createReadStream(file),
      options: {
        filename: file,
        contentType: 'image/jpeg'
      }
    })
  }
  // merge the optional processing parameters into the form data
  formData = Object.assign({}, formData, params);
  const options = {
    method: 'POST',
    formData: formData,
    uri: `https://api.tabscanner.com/api/2/process`,
    headers: {
      'apikey': API_KEY
    }
  };
  const result = await rp(options)
  return JSON.parse(result)
}
(async () => {
  try {
    const imageFile = 'app/path/to/imageFile/image.jpg'
    const result = await callProcess([imageFile], {})
    // this token is used later to request the result
    const token = result.token
    console.log(token)
  } catch (e) {
    console.log(e)
  }
})();
package main

import (
	"bytes"
	"io"
	"io/ioutil"
	"log"
	"mime/multipart"
	"net/http"
	"os"
	"path/filepath"

	"github.com/buger/jsonparser"
)

func main() {
	filePath := "./app/path/to/imageFile/image.jpg"
	apikey := "yourapikey"

	file, err := os.Open(filePath)
	if err != nil {
		log.Fatal(err)
	}
	defer file.Close()

	// build the multipart/form-data body containing the image
	body := &bytes.Buffer{}
	writer := multipart.NewWriter(body)
	part, _ := writer.CreateFormFile("file", filepath.Base(file.Name()))
	io.Copy(part, file)
	writer.Close()

	r, _ := http.NewRequest("POST", "https://api.tabscanner.com/api/2/process", body)
	r.Header.Add("Content-Type", writer.FormDataContentType())
	r.Header.Add("apikey", apikey)

	client := &http.Client{}
	response, err := client.Do(r)
	if err != nil {
		log.Fatal(err)
	}
	defer response.Body.Close()

	processBody, _ := ioutil.ReadAll(response.Body)
	// this token is used later to request the result
	token, _ := jsonparser.GetString(processBody, "token")
	log.Println(token)
}
The result endpoint returns the result of the processed document.
It is a GET request.
The path of the request should contain the token returned by the related process call.
ARGUMENTS
token REQUIRED | A string. The token returned by the process call. |
ATTRIBUTES
status | The status for the request: done, pending or failed. |
status_code | An integer. The status code of the request. |
success | A boolean. Whether the request was successful. |
code | An integer. The error code of the request. |
result | An object containing the result data. |
Result Object
GENERAL DATA | |
---|---|
establishment | A string. The establishment name detected on the receipt. This works by using a combination of machine learning and custom configurations. If you process a finite set of establishments, then accuracy can be dramatically improved via custom configurations. |
date | A string. The purchase date and time in the format YYYY-MM-DD hh:mm:ss |
dateISO | A string. The purchase date and time in ISO format YYYY-MM-DDThh:mm:ss |
total | A float representing the total amount. |
subTotal | A float representing the subtotal. |
cash | A float representing the amount of cash paid. |
change | A float representing the amount of change returned to the purchaser. |
tax | A float representing the total amount of tax. |
taxes | An array with a breakdown of all the tax amounts discovered. |
serviceCharges | An array with a breakdown of all the service charges discovered. |
tip | A float representing the total amount of the tip. |
discount | A float representing the total discount applied to the receipt. |
discounts | An array with a breakdown of all the discounts applied to the receipt. |
rounding | A float representing an amount of rounding applied to the receipt. |
address | A string containing the address of the establishment found on the receipt. This string is not normalized and contains all address info that was extracted. |
addressNorm | The normalized version of the address attribute, broken down as follows: |
url | A string representing the website address extracted from the receipt. |
phoneNumber | A string representing the phone number extracted from the receipt. Phone number formats are not normalized and are extracted in the form found in the receipt. |
paymentMethod | A string representing the payment method found in the receipt. |
barcodes | An array of barcodes extracted from the receipt. Each item in the array is itself an array with 2 elements: the first is the barcode data and the second is the barcode type. Supported types are: |
currency | A string. The detected currency extracted from the receipt. Currently supported currencies: |
expenseType BETA | A string representing the expense classification of the receipt. Supported types are: |
customFields | |
documentType | A string representing the type of document detected when auto is passed as the documentType parameter to the process endpoint. Supported values are: |
LINE DATA | |
lineItems | An array of LineItem objects representing the products found in the receipt. |
summaryItems | An array of LineItem objects representing lines that were not products, e.g. Total, Cash, Change etc. |
CONFIDENCES | |
totalConfidence | A float value ranging from 0 to 1 representing how confident the system is that the total field is correct and is in fact the total. |
subTotalConfidence | A float value ranging from 0 to 1 representing how confident the system is that the subTotal field is correct and is in fact the subtotal. |
taxesConfidence | An array containing float values ranging from 0 to 1 representing how confident the system is that the taxes fields are correct and are in fact the taxes. |
serviceChargeConfidences | An array containing float values ranging from 0 to 1 representing how confident the system is that the service charge fields are correct and are in fact service charges. |
tipConfidence | A float value ranging from 0 to 1 representing how confident the system is that the tip field is correct and is in fact a tip. |
discountConfidences | An array containing float values ranging from 0 to 1 representing how confident the system is that the discount fields are correct and are in fact the discounts. |
cashConfidence | A float value ranging from 0 to 1 representing how confident the system is that the cash field is correct and is in fact cash. |
changeConfidence | A float value ranging from 0 to 1 representing how confident the system is that the change field is correct and is in fact change. |
roundingConfidence | A float value ranging from 0 to 1 representing how confident the system is that the rounding field is correct and is in fact rounding. |
dateConfidence | A float value ranging from 0 to 1 representing how confident the system is that the date field is correct. |
establishmentConfidence | A float value ranging from 0 to 1 representing how confident the system is that the establishment field is correct. |
validatedEstablishment | A boolean indicating that the establishment has been cross-referenced with the phoneNumber or address on the receipt and confirmed in our database. |
validatedTotal | A boolean indicating a very high confidence score for total. (0.99) |
validatedSubTotal | A boolean indicating a very high confidence score for subtotal. (0.99) |
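The barcodes attribute in the table above is described as an array of two-element arrays (barcode data first, barcode type second). A minimal sketch of unpacking it into named pairs might look like this. `Barcode` and `extract_barcodes` are hypothetical names used only for illustration, and items that do not have exactly two elements are skipped as a defensive assumption.

```python
from typing import Any, Dict, List, NamedTuple


class Barcode(NamedTuple):
    data: str   # first element: the barcode data
    kind: str   # second element: the barcode type


def extract_barcodes(result: Dict[str, Any]) -> List[Barcode]:
    """Convert the raw barcodes array of a Result Object into Barcode tuples."""
    barcodes: List[Barcode] = []
    for item in result.get("barcodes") or []:
        if len(item) == 2:
            barcodes.append(Barcode(data=str(item[0]), kind=str(item[1])))
    return barcodes
```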
LineItem Object
lineTotal | A float representing the total extracted from the line. |
descClean | A string containing the consolidated and cleaned product description found in the line. This will include any supplemental descriptions found on lines adjacent to the lineTotal. It will also be cleaned of any price or discount information found in the description. |
desc | A string containing the text found on the same line as the lineTotal. |
qty | A float representing a quantity of a product found on the line. This will default to 0 rather than 1 and will only return 1 if it finds a 1. |
price | A float representing the price extracted from the line. |
unit | A float representing the unit extracted from the line. |
productCode | A string representing a productCode found in the line. |
symbols | An array of strings representing any symbols found in the line. Typically these are tax codes assigned to each line after the lineTotal. |
supplementaryLineItems | In the event the system was unable to resolve text above and below lineTotals to a single line, a dictionary containing an array of text found above and below the line item will be available. Note: the system always attempts to resolve this automatically; however, in the event of a failure, this dictionary is returned as a fallback. |
lineType | If the line item is a summary item, the system will attempt to classify the line item as one of the following: |
Tabscanner does not attempt to detect whether an image is valid, but a number of fields can be used in conjunction to achieve this. For example, a total of zero with a confidence score of zero, combined with an empty date and an empty establishment field, would strongly indicate that the image was unreadable as a receipt. We leave the implementation of this up to the calling application, as use-cases vary and no one algorithm will suffice.
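The example heuristic described above could be written as follows. This is only a sketch of that one example, not a recommended validity check: `looks_unreadable` is a hypothetical name, and treating missing or null fields the same as empty ones is an assumption. Applications will likely need to tune the rule to their own use-case.

```python
from typing import Any, Dict


def looks_unreadable(result: Dict[str, Any]) -> bool:
    """Return True when the Result Object matches the 'unreadable' pattern:
    zero total, zero total confidence, no date and no establishment."""
    total = result.get("total") or 0
    total_confidence = result.get("totalConfidence") or 0
    has_date = bool(result.get("date"))
    has_establishment = bool(result.get("establishment"))
    return (
        total == 0
        and total_confidence == 0
        and not has_date
        and not has_establishment
    )
```

Here `result` is the object found under the result attribute of the response, not the whole response.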
string token = "yourtoken";
string url = "https://api.tabscanner.com/api/result/" + token;
string key = "yourapikey";
var client = new RestClient(url);
var request = new RestRequest(Method.GET);
request.AddHeader("apikey", key);
IRestResponse response = client.Execute(request);
var content = response.Content; // raw content as string
require 'rest_client'
require 'json'
API_KEY = 'yourapikey'
token = 'yourtoken'
headers = {apikey: API_KEY}
endpoint = 'https://api.tabscanner.com/api/result/'
response = RestClient.get(endpoint + token, headers)
json = JSON.parse(response)
puts json
import requests
import json
from dotenv import load_dotenv
import os
# load your environment containing the api key
BASEDIR = os.path.abspath(os.path.dirname(__file__))
load_dotenv(os.path.join(BASEDIR, '.env'))
API_KEY = os.getenv("API_KEY")
def callResult(token):
    url = "https://api.tabscanner.com/api/result/{0}"
    endpoint = url.format(token)
    headers = {'apikey': API_KEY}
    response = requests.get(endpoint, headers=headers)
    result = json.loads(response.text)
    return result
$token = 'yourtoken';
// the token must be defined first and appended to the URL
$url = 'https://api.tabscanner.com/api/result/' . $token;
$apikey = 'yourapikey';
$headers = array(
"apikey:" . $apikey
);
$cSession = curl_init();
curl_setopt($cSession, CURLOPT_URL, $url);
curl_setopt($cSession, CURLOPT_RETURNTRANSFER, true);
curl_setopt($cSession, CURLOPT_HEADER, false);
curl_setopt($cSession, CURLOPT_HTTPHEADER, $headers);
$result = curl_exec($cSession);
curl_close($cSession);
$json = json_decode( $result );
String APIKEY = "yourapikey";
String token = "yourtoken";
String endpoint = "https://api.tabscanner.com/api";
HttpResponse<JsonNode> jsonResponse = Unirest.get(endpoint + "/result/{token}")
    .routeParam("token", token)
    .header("apikey", APIKEY)
    .asJson();
'use strict';
require('dotenv').config()
const API_KEY = process.env.API_KEY
const fs = require("fs");
const rp = require("request-promise");
async function callResult(token) {
  const options = {
    method: 'GET',
    uri: `https://api.tabscanner.com/api/result/${token}`,
    headers: {
      'apikey': API_KEY
    }
  };
  const result = await rp(options)
  return JSON.parse(result)
}
(async () => {
  try {
    // your token from the previous process call
    const token = 'yourtoken'
    let result = await callResult(token)
    console.log(result)
  } catch (e) {
    console.log(e)
  }
})();
endpoint := "https://api.tabscanner.com/api/result/" + token
rr, _ := http.NewRequest("GET", endpoint, nil)
rr.Header.Add("apikey", apikey)
clientr := &http.Client{}
resp, _ := clientr.Do(rr)
result, _ := ioutil.ReadAll(resp.Body)
establishment,_ := jsonparser.GetString(result, "result","establishment")
log.Println(establishment)
The credit endpoint returns the number of credits left on the account.
It is a GET request and returns a single JSON number.
Simply call the /credit endpoint with a header named apikey containing the API key.
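A call to the credit endpoint might be sketched as below, using only the Python standard library. Note the assumptions: the full path is not spelled out in this document, so `/api/credit` is a guess modelled on the other endpoint URLs and is kept as an overridable parameter. The function names are hypothetical as well.

```python
import json
import urllib.request
from typing import Union

BASE_URL = "https://api.tabscanner.com"
CREDIT_PATH = "/api/credit"  # assumption: exact path is not given in this document


def build_credit_request(api_key: str, path: str = CREDIT_PATH) -> urllib.request.Request:
    """Build a GET request carrying the API key in the apikey header."""
    return urllib.request.Request(
        BASE_URL + path, headers={"apikey": api_key}, method="GET"
    )


def parse_credit(body: str) -> Union[int, float]:
    """The endpoint returns a single JSON number; reject anything else."""
    value = json.loads(body)
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise ValueError(f"Expected a JSON number, got: {body!r}")
    return value


def get_credit(api_key: str) -> Union[int, float]:
    """Perform the request and return the remaining credits."""
    with urllib.request.urlopen(build_credit_request(api_key)) as response:
        return parse_credit(response.read().decode("utf-8"))
```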
Advanced features may be available to you depending on the status of your account. They include features such as:
Custom fields can be enabled on your account where Tabscanner will implement special rules to extract specific data from a receipt or invoice. Some examples of custom fields we currently extract are:
When capturing detailed line items from receipts, one major challenge arises when a single item spans multiple lines of the receipt. For example, the line total may be on a different line from the description, with the price and discount on yet another. When a receipt contains many lines, it can become difficult to tell which description belongs to which line total or price.
Tabscanner will attempt to resolve these relationships automatically, but in some cases custom training of the system may be required to achieve very accurate results.
In general there are 3 ways to improve the results returned by the API
Our technology has been engineered to handle many different image anomalies. However, following the guidance below will give your images the highest possible accuracy.
If you have any technical questions not answered by this document or need technical support, please email your issue to support@tabscanner.com.