When SnapTrade was just getting started, we focused only on official APIs provided by financial institutions. The rationale was simple enough: all official APIs have official specs. We would just need to code to the spec and our integration would be perfect. Turns out, this was a naive thought.
We realized very quickly just how naive we had been. This is how we came to understand that we were in uncharted territory and that we were the people to map it.
When You Realize About Transaction Data That "The Map Is Not The Territory"
After building the first iteration of the SnapTrade backend and successfully pulling holdings data from several stock brokers, we decided to go deeper and start collecting transaction data.
We chose as our subject a broker where we had a pretty good spec for historical account transactions from observing our test accounts. We figured there couldn't be more than a few dozen types. With the thousands of user accounts we had available, surely we'd be able to get a good sample and quickly support all the types.
Sure enough, within a week we were successfully pulling data from thousands of connected accounts. As expected, we started seeing unknown transaction types almost immediately, cropping up as errors in the transaction type mapping function. We documented the additional transaction types that weren't listed in the spec and continued on.
Sure enough, a few days later we started seeing the same error again, but for another new transaction type. Our team added another line of code and we carried on.
A few days later, it happened again. This time, we didn't even get the first issue fixed before a second case popped up.
What was happening?
It didn't slow down, either. It kept on going. Almost every week we encountered a brand new transaction type coming across the API. A couple of months went by before we realized that we had severely underestimated the difficulty of the problem.
To reconcile this we went to the broker we had specced, with whom we had a good relationship, and asked them, "Hey, do you have a full list of your transaction types?" And wouldn't you know it, they couldn't even give one to us. Not that we weren't permitted to see such a list - they didn't have a list at all. In fact, this was the reason it wasn't in the API spec.
Luckily, they had some incredible team members who were able to get us enough sample data to develop a comprehensive list for ourselves. When the dust had settled, they had compiled a list of 261 different transaction types.
261! At just one broker! The challenge was much larger than we could have imagined.
Over 100 Different Transaction Types Every Week
Building integrations with financial institutions has always been a process of discovery more akin to science than engineering. Complete, well-specced documentation is rare.
Now that we have built integrations with over 20 brokerages, a big part of SnapTrade's integration process involves producing our own documentation based on the data we observe from the APIs we connect to.
Every day we see transaction data coming across the API. Every day we sort through all these different transactions to identify and log the different transaction types.
You know what we've found?
There are over 100 different transaction types coming over the API on a weekly basis!
The challenge of discovering new transaction types is complicated by the data itself. Most of them are difficult to map because it isn't clear what a transaction with type [XYZ] and description ["ajkbak"] even means.
When running any app designed to help users manage their investments, having access to historical transaction data can be incredibly valuable. However, with many brokerages offering disparate transaction types - let alone data formats and details - creating a unified experience for the end user isn't just challenging. It can be totally maddening.
Key Data To Analyze User Buying History
Ok, it's difficult enough to comprehend how there are over 100 distinct transaction types across multiple brokers. It gets even crazier once you consider that each of those types has a purchase and sale date, security, and transaction amount. Then the rabbit hole widens even further! Each broker could also present any of the following fields differently:
- Ticker Symbol
- Company Name
- Exchange
- Trade Date
- Buy/Sell Indicator
- Quantity
- Price Per Share
- Total Transaction Value
- Commission
- Trade Type (e.g., Market, Limit, Stop Loss, etc.)*
- Settlement Date
- Order Execution Time
- Transaction Status
- Realized Gain/Loss*
- Dividends Received
- Adjusted Basis (after splits, dividends, etc.)*
- Average Cost Per Share
While this data may be available from some brokerage firm systems, it’s not always easily parsed and utilized. For example, some brokerages will return the units and price information in a description. It could return a text string that says: [bought five units of Apple at $68.5]. That data has to be parsed out.
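As a rough illustration of what "parsed out" can involve, here is a minimal Python sketch for a description shaped like the example above. The regular expression, the `NUMBER_WORDS` table, and the `parse_description` helper are all assumptions made for this example; real broker descriptions vary widely and usually need one pattern per broker.

```python
import re
from decimal import Decimal, InvalidOperation

# Hypothetical lookup for quantities spelled out as words.
NUMBER_WORDS = {
    "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
}

# Assumed description shape: "<bought|sold> <units> units of <name> at $<price>".
DESCRIPTION_RE = re.compile(
    r"(?P<side>bought|sold)\s+"
    r"(?P<units>\d+(?:\.\d+)?|[a-z]+)\s+units?\s+of\s+"
    r"(?P<name>.+?)\s+at\s+\$?(?P<price>\d+(?:\.\d+)?)",
    re.IGNORECASE,
)


def parse_description(text):
    """Return a dict of parsed fields, or None if the text does not match."""
    match = DESCRIPTION_RE.search(text)
    if match is None:
        return None
    raw_units = match.group("units").lower()
    if raw_units in NUMBER_WORDS:
        units = Decimal(NUMBER_WORDS[raw_units])
    else:
        try:
            units = Decimal(raw_units)
        except InvalidOperation:
            # A word we do not recognize as a number: give up rather than guess.
            return None
    return {
        "side": "BUY" if match.group("side").lower() == "bought" else "SELL",
        "units": units,
        "name": match.group("name"),
        "price": Decimal(match.group("price")),
    }


print(parse_description("bought five units of Apple at $68.5"))
```

Returning `None` for anything that does not match keeps unparseable descriptions visible for manual review instead of silently producing wrong numbers.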
Other difficulties include brokerages that return unformatted numbers or use foreign languages or currencies.
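To make the number-formatting problem concrete, here is a small sketch of a best-effort normalizer in Python. The `to_decimal` helper is hypothetical. It assumes US-style formatting (a `$` sign, commas as thousands separators) and treats parentheses as a negative amount, which is an accounting convention we are assuming here rather than something any particular broker guarantees. Other currencies and decimal commas would need their own handling.

```python
from decimal import Decimal, InvalidOperation


def to_decimal(value):
    """Best-effort conversion of a broker-supplied amount to Decimal.

    Returns None when the value cannot be interpreted as a number.
    """
    if value is None or isinstance(value, bool):
        return None
    if isinstance(value, (int, Decimal)):
        return Decimal(value)
    if isinstance(value, float):
        # Go through str() to avoid binary floating point noise.
        return Decimal(str(value))

    text = str(value).strip()
    # Assumption: "(12.50)" means negative 12.50.
    parenthesized = text.startswith("(") and text.endswith(")")
    cleaned = text.strip("()").replace("$", "").replace(",", "").strip()
    if not cleaned:
        return None
    negative = parenthesized or cleaned.startswith("-")
    cleaned = cleaned.lstrip("+-")
    try:
        number = Decimal(cleaned)
    except InvalidOperation:
        return None
    return -number if negative else number


for raw in ["-$847.74", "$2,211.82", "+2.000", "2.00000", -1820, 1.82]:
    print(repr(raw), "->", to_decimal(raw))
```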
To deliver a functional investment experience, user data is expected to be accurate and to include a long transaction history. That's challenging to deliver.
Breaking Down Historical Transactions Activity Data
Here's a hypothetical example to better understand the challenges of utilizing historical transaction data. Let’s say you connect to multiple investment companies' data feeds and get trade data. Here's how different institutions could have it formatted (data edited for privacy):
{
  "fees": "0.00",
  "state": "completed",
  "amount": "20.00",
  "direction": "deposit",
  "created_at": "2024-02-01T11:05:17.117603-05:00",
  "updated_at": "2024-02-02T15:18:40.993938Z",
  "status_description": "",
  "early_access_amount": "0.00",
  "expected_landing_date": "2024-02-02",
  "instant_limit_to_grant": "0.00",
  "expected_landing_datetime": "2024-02-02T09:00:00-05:00"
}
{
  "type": "TRADE",
  "netAmount": -1820,
  "subAccount": "2",
  "description": "BUY TRADE",
  "settlementDate": "2022-04-25",
  "transactionDate": "2022-04-21T12:46:11+0000",
  "transactionItem": {
    "cost": -1820,
    "price": 1.82,
    "amount": 1000,
    "instrument": {
      "cusip": "23257B107",
      "symbol": "CYN",
      "assetType": "EQUITY"
    },
    "instruction": "BUY"
  },
  "transactionSubType": "BY",
  "cashBalanceEffectFlag": true
}
{
  "fee": 0,
  "price": 546,
  "netAmount": -1092,
  "tradeDate": "2024-01-12",
  "accountType": "BRKG",
  "shareQuantity": "2.00000",
  "sequenceNumber": "3001720379",
  "settlementDate": "2024-01-17",
  "transactionCode": "BUY",
  "transactionType": "Buy",
  "cwrSequenceNumber": "",
  "monetaryInstructionId": "",
  "transactionDescription": "NVIDIA CORP",
  "checkWritingRedemptionCheckNumber": "0"
}
{
  "amount": "-$847.74",
  "symbol": "VOO",
  "__typename": "History",
  "txnCatDesc": "Investment Activity",
  "brokCsvData": " 12/14/2021,\" YOU BOUGHT VANGUARD INDEX FUNDS S&P 500 ETF USD (VOO) (Cash)\", VOO,\" VANGUARD INDEX FUNDS S&P 500 ETF USD\",Cash,2,423.87,,,,-847.74,2211.82,12/16/2021",
  "cashBalance": "$2,211.82",
  "description": "YOU BOUGHT VANGUARD INDEX FUNDS S&P 500 ETF USD (VOO) (Cash)",
  "detailItems": [
    {
      "key": "Date",
      "value": "12/14/2021",
      "__typename": "DetailItem"
    },
    {
      "key": "Symbol",
      "value": "VOO",
      "__typename": "DetailItem"
    },
    {
      "key": "Symbol Desc.",
      "value": "VANGUARD INDEX FUNDS S&P 500 ETF USD",
      "__typename": "DetailItem"
    },
    {
      "key": "Type",
      "value": "Cash",
      "__typename": "DetailItem"
    },
    {
      "key": "Shares",
      "value": "+2.000",
      "__typename": "DetailItem"
    },
    {
      "key": "Price",
      "value": "423.8683",
      "__typename": "DetailItem"
    },
    {
      "key": "Amount",
      "value": "-$847.74",
      "__typename": "DetailItem"
    },
    {
      "key": "Settlement Date",
      "value": "12/16/2021",
      "__typename": "DetailItem"
    }
  ]
}
Crazy, right? The data sets mostly contain similar data, but the formatting is inconsistent. It would be nearly impossible to parse without a significant manual effort.
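One way to picture that manual effort is a small adapter per payload shape, each translating broker-specific field names into a common record. The Python sketch below does this for the two trade payloads above that carry structured price and quantity fields. The `UnifiedTransaction` schema and the adapter names are hypothetical and are not SnapTrade's actual data model. The field names inside the adapters come straight from the samples.

```python
from dataclasses import dataclass
from datetime import date, datetime
from decimal import Decimal
from typing import Optional


@dataclass(frozen=True)
class UnifiedTransaction:
    """Hypothetical common schema, used only for this illustration."""
    action: str
    symbol: Optional[str]
    units: Decimal
    price: Decimal
    net_amount: Decimal
    trade_date: date
    settlement_date: date


def from_nested_trade(raw: dict) -> UnifiedTransaction:
    """Adapter for the payload with a nested "transactionItem" object."""
    item = raw["transactionItem"]
    traded_at = datetime.strptime(raw["transactionDate"], "%Y-%m-%dT%H:%M:%S%z")
    return UnifiedTransaction(
        action=item["instruction"].upper(),
        symbol=item["instrument"]["symbol"],
        units=Decimal(str(item["amount"])),
        price=Decimal(str(item["price"])),
        net_amount=Decimal(str(raw["netAmount"])),
        trade_date=traded_at.date(),
        settlement_date=date.fromisoformat(raw["settlementDate"]),
    )


def from_flat_trade(raw: dict) -> UnifiedTransaction:
    """Adapter for the flat payload that uses "transactionCode"."""
    return UnifiedTransaction(
        action=raw["transactionCode"].upper(),
        symbol=None,  # this payload carries a company name but no ticker
        units=Decimal(raw["shareQuantity"]),
        price=Decimal(str(raw["price"])),
        net_amount=Decimal(str(raw["netAmount"])),
        trade_date=date.fromisoformat(raw["tradeDate"]),
        settlement_date=date.fromisoformat(raw["settlementDate"]),
    )


# Trimmed copies of two of the sample payloads above.
nested = {
    "netAmount": -1820,
    "settlementDate": "2022-04-25",
    "transactionDate": "2022-04-21T12:46:11+0000",
    "transactionItem": {
        "price": 1.82,
        "amount": 1000,
        "instrument": {"symbol": "CYN"},
        "instruction": "BUY",
    },
}
flat = {
    "price": 546,
    "netAmount": -1092,
    "tradeDate": "2024-01-12",
    "shareQuantity": "2.00000",
    "settlementDate": "2024-01-17",
    "transactionCode": "BUY",
}

for txn in (from_nested_trade(nested), from_flat_trade(flat)):
    print(txn.action, txn.symbol, txn.units, txn.price, txn.net_amount, txn.trade_date)
```

Even in this tiny sketch the second adapter has to leave `symbol` empty, because that payload identifies the security only by company name. Gaps like that are where most of the real mapping work tends to hide.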
Because brokerages almost universally lack transparency about their transaction types, we handle this with a formalized, iterative process: logging new transaction types as they appear and building in intentional observability.
An added advantage is having a network size that allows for more data and a larger sample size. It becomes much easier to identify transaction mapping issues, prioritize fixes according to how common or severe the issue is, and implement the fixes in a way that improves the network as a whole rather than piecemeal.
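The idea of logging unknown types and prioritizing them by frequency can be sketched in a few lines of Python. This is a simplified illustration and not our production code: the `KNOWN_TYPES` table, the broker name, and both function names are made up for the example, and a real system would persist the counts rather than keep them in memory.

```python
import logging
from collections import Counter

logger = logging.getLogger("transaction_mapping")

# Hypothetical mapping table: (broker, raw type) -> unified type.
KNOWN_TYPES = {
    ("broker_a", "BUY"): "BUY",
    ("broker_a", "SELL"): "SELL",
    ("broker_a", "DIV"): "DIVIDEND",
}

# In-memory tally of unmapped types, keyed by (broker, raw type).
unknown_counts = Counter()


def map_transaction_type(broker, raw_type, description=""):
    """Map a raw broker type to a unified type, recording anything unknown."""
    key = (broker, raw_type.strip().upper())
    mapped = KNOWN_TYPES.get(key)
    if mapped is None:
        # Record the miss instead of raising, so one new type does not
        # block the rest of the sync.
        unknown_counts[key] += 1
        logger.warning(
            "unmapped transaction type broker=%s type=%r description=%r",
            broker, raw_type, description,
        )
        return "UNKNOWN"
    return mapped


def review_queue(limit=10):
    """Most frequently seen unmapped types first, to prioritize fixes."""
    return unknown_counts.most_common(limit)
```

Calling `review_queue()` after a sync gives a ranked list of what to map next, so the most common unknown types get fixed first.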
Three Key Uses of Historical Transaction Data
Consider these potential use cases if you’re debating the value of offering users a unified data set bringing together their portfolios from multiple brokerage accounts.
Buying Behavior
Understanding buying behavior through historical transaction data can significantly enhance an investment app’s usefulness.
By aggregating buy and sell transactions across multiple accounts, the app can reveal patterns and preferences in investment choices, such as favoring certain sectors or risk levels. This insight allows investors to evaluate their biases and diversification strategies effectively, enabling more informed decision-making.
Insights into buying behavior can also assist in predictive analytics. If an investor consistently reacts to specific market conditions by purchasing particular types of assets, the app could use this historical data to suggest future buys under similar conditions. This proactive approach helps users to capitalize on trends and optimize their investment strategies.
Reactions to Current Events
Historical transaction data can clarify how investors respond to market fluctuations and global events, offering valuable context for risk management.
An investment app could analyze past data to show how users adjusted their portfolios in response to geopolitical tensions or economic announcements. This type of analysis helps users understand the impact of external factors on their investments and refine their response strategies.
Additionally, recognizing patterns in reactions to current events can empower the app to offer timely advice or automated trading features. By identifying common tendencies among users during specific events, the app can provide targeted recommendations to mitigate risks or take advantage of market opportunities. This real-time strategic guidance can be a significant differentiator for users looking to enhance their investment outcomes.
Authentication
Robust authentication methods are critical in financial applications to ensure the security of sensitive data and transactions.
Historical transaction data can be instrumental in developing behavioral biometrics that recognize unusual transaction patterns or login attempts, which might indicate unauthorized access. This layer of security helps protect users from potential threats and builds trust in the app’s infrastructure.
Instead of relying solely on traditional authentication methods, which can be cumbersome or vulnerable to breaches, incorporating modern verification to link accounts can provide a more adaptive and robust security framework, ensuring users' assets and data are well-protected.
Why Is It So Difficult to Get Users Their Historical Transactions?
As our anecdote above shows, building an investment app presents a litany of challenges, ranging from technical issues to restrictions imposed by other businesses. This is especially true for apps that combine data from multiple sources or enable users to enter trades.
- Data Access Restrictions: Financial institutions often have strict controls over who can access transaction data, necessitating complex permission systems and agreements to access such data legally. These restrictions can significantly hinder developers from effectively pulling historical transactions.
- Inconsistent Data Formats: Each financial institution stores data in proprietary formats, making it challenging for developers to normalize this data across various sources for consistent application performance and user experience.
- Privacy and Security Regulations: Compliance with financial privacy laws and regulations adds complexity to accessing and handling user data. Ensuring all data transmissions meet these legal and industry standards can slow the integration process and increase development costs.
- API Rate Limits and Costs: Financial institutions often limit the frequency of data requests through their APIs to prevent overloaded systems, which can delay data retrieval. Additionally, accessing these APIs can incur costs that escalate with the requested data volume, impacting the feasibility of offering extensive historical data access.
- Authentication and Security Protocols: Robust security measures are necessary to protect sensitive financial data but can complicate user verification processes. While crucial for protecting data, each additional security measure can introduce more friction and delays in user data access.
- Legacy Systems: Older financial systems may not be designed to interact seamlessly with modern APIs, resulting in slower data retrieval and increased difficulty in integrating with current technologies. These legacy systems can significantly delay or complicate accessing historical data.
- Delays in Data Synchronization: There can be inherent delays in how data is synchronized between different systems and platforms, leading to outdated or incomplete data being displayed to users. This misalignment can affect the accuracy of the transaction history provided.
- Historical Data Volume: Managing large volumes of historical data can be challenging, especially when ensuring data remains accessible and performant within the app. The greater the historical data volume, the more resource-intensive it is to store, manage, and retrieve efficiently.
The Last Word
Many weeks after we had received the comprehensive transaction type list from the broker, we received an email from a different point of contact we had at the broker. They asked us for the list.
Turns out, a potential customer of ours, whom we had told about all the transaction types we learned of from the broker's list, had gone to the broker independently to request that very list for themselves. But our point of contact had no idea a list like that even existed within their own company!
That's just how deep this issue can go: even the brokerages struggle to manage the chaos of transaction types. Learning how to observe and log new transaction types has helped us bring some form of order to this wild west of financial data.
*This data is not available over the SnapTrade API for Transactions, but is available through Orders data.