Sometimes, you want to load data into Acumatica, but you don't have a "home" for it.
It might be Historical Sales data, Historical Purchase data, Historical Invoice data, Marketing data, etc.
But the goal of ERP is to put everything in one place and have one version of the truth, right?
Now, I'm not saying that Acumatica should pretend to be a Data Warehouse because it's not, but I do think that Acumatica can act as a Data Warehouse in some situations.
A while back, I wrote a blog post about how you can create custom screens for storing outside data that doesn't have a home in Acumatica.
The problem with that idea is that you have to create a separate custom screen for every scenario, carefully defining the columns that you need and the data types for each column. It's very time-consuming.
Wouldn't it be nice if we had one screen that would allow us to define a home for any type of outside data, as many different homes as we need, with different numbers of columns and data types in each "home"?
As I have been considering use cases for the soon-to-be-released AugSQL product, it occurred to me this weekend that we already have a screen in Acumatica that allows us to do this.
This was a really cool realization and it further validates my theory that, in general, the world doesn't need more software, but just needs to do a better job of creatively using the software that it already has.
"What is this magical existing screen", I hear you ask? Why it's the Import by Scenario (SM206036) screen of course. It only took me 10 years to realize this!
I'm accustomed to thinking of the PREPARE and IMPORT buttons on that screen as being one step, something that you always do together.
But why not just use the PREPARE button and allow the data to sit there forever, acting as an "Acumatica Duct Tape Data Warehouse" of sorts?
Interesting idea right?
I'm posting this as a Discussion Topic rather than a Blog Post because I haven't finished the idea and I'll need to continue the conversation, hopefully with the help of some of you. But I did some work on it this weekend and wanted to document my findings so far.
Rather than continue to wax philosophical, let's go through this with an actual example.
The Data Providers (SM206015) screen already has the ability to connect to multiple sources of data (Excel, CSV, etc.). You can set up your connections there and bring in as many columns as you need.
In this example, I'm connecting to a very simple Excel data source with 6 columns and 50 rows of Historical Sales data:
Now let's connect our Excel file to the Data Provider. It doesn't matter how many columns I have in my data source. The Data Provider will pick them up automatically.
Easy right?
Before we go to the Import by Scenario (SM206036) screen, we have to connect our Data Provider to an Import Scenario using the Import Scenarios (SM206025) screen.
But we don't have to actually do anything here. This is just a necessary step. The Acumatica screen that we connect the Data Provider to doesn't matter because we aren't ever going to map any fields to it or ever run the import.
As you can see here, I've picked the Companies (CS101500) screen, but you could pick any screen that you like. It doesn't matter.
You don't have to do anything on the MAPPING tab because the Import Scenario isn't going to "do" anything. We're just using it as a place to store data. My MAPPING tab is empty as you can see here:
Now that we've connected our DW - Historical Sales Data Provider to an Import Scenario (also called DW - Historical Sales to keep things simple), we can go to the Import by Scenario (SM206036) screen and press the PREPARE button. The result looks like this:
We're done! We just loaded external data into Acumatica and Acumatica is now acting as a Data Warehouse for that data.
Cool huh?
This technique has been staring me in the face for over 10 years, but I didn't discover it until this weekend.
Now for the tricky part, and the reason I started this Discussion Topic: to capture what I've learned so far.
Even though there are 6 columns of data in the previous screenshot, all of that data got stored in one column of one table in the Acumatica database. Which column and which table? This column and this table:
SYData.FieldValues
The tricky part is understanding how the data got stored so we can extract it back out into multiple columns to do reporting on it.
Through much trial and error and Googling different things this weekend, I figured out the first part. Maybe someone out there can help me with the second part. I'm not planning to spend any time on the second part this weekend, but I'll revisit it at another time. Maybe, by the time I do, someone will already have figured it out and posted here.
Since all of the data, regardless of how many columns there are, gets stored in one column, Acumatica needs to know how to split it back out into multiple columns.
It does this by using a special Unicode character to "delimit" the data so it knows when to begin a new column. You don't see it though when you query the data using SQL. If you query SYData.FieldValues in SQL Management Studio, it looks like Acumatica only stored the first column in SYData.FieldValues:
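One quick way to convince yourself the rest of the data really is there (a diagnostic sketch, not from the original post, using the SYData table named below): compare the length SQL Server reports with what the results grid appears to show. The grid display stops at the first NUL character, but LEN and DATALENGTH still count everything after it:

```sql
-- The SSMS results grid stops displaying at the first NCHAR(0),
-- but the full value is still stored: LEN counts every character
-- and DATALENGTH counts every byte (2 per NVARCHAR character).
SELECT TOP 5
    LineNbr,
    LEN(FieldValues)        AS CharCount,
    DATALENGTH(FieldValues) AS ByteCount,
    FieldValues
FROM SYData
ORDER BY LineNbr
```

If CharCount is much larger than the visible text, the "missing" columns are hiding behind the first delimiter.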
Before you call me a liar, keep reading 😀
The character is NULL and can be represented as UNICODE(0) in SQL.
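As a quick sanity check (a sketch, not from the original post), CHARINDEX can confirm where the first NCHAR(0) delimiter sits in each row. Note that it needs a binary collation, because the default collations don't match the NUL character reliably:

```sql
-- Sanity check (sketch): position of the first NCHAR(0) delimiter
-- in each prepared row. The binary collation is required for
-- CHARINDEX to match the NUL character.
SELECT TOP 5
    LineNbr,
    CHARINDEX(NCHAR(0x00) COLLATE Latin1_General_BIN,
              FieldValues COLLATE Latin1_General_BIN) AS FirstDelimiterPos
FROM SYData
ORDER BY LineNbr
```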
In order to see all the data, let's replace the UNICODE(0) delimiter with something else that we don't expect will ever appear in the data. How about three dashes.
Now, you can't simply use REPLACE on its own, and the reason comes down to collation: under SQL Server's default (non-binary) collations, NCHAR(0) is treated as an undefined character and is effectively ignored in string comparisons, so REPLACE never finds the delimiter. Forcing a binary collation like Latin1_General_BIN makes the comparison byte-by-byte, and the delimiter matches:
Here's that SQL code for you so you don't have to re-type it from the screenshot:
SELECT TOP 5
    LineNbr,
    FieldValues,
    REPLACE(FieldValues COLLATE Latin1_General_BIN,
            nchar(0x00) COLLATE Latin1_General_BIN,
            '---') AS 'Replaced Delimiter'
FROM SYData
JOIN SYMapping
    ON SYData.CompanyID = SYMapping.CompanyID
   AND SYData.MappingID = SYMapping.MappingID
WHERE SYMapping.Name = 'DW - Historical Sales'
ORDER BY LineNbr
Now that we can see all of the data, we just need to use our new --- delimiter to split the data back out into separate columns, but I haven't figured that out yet so I'm stopping here for now.
I think this is a really cool idea because we can store data in an Import Scenario, then we should be able to use a SQL View to pull it back out.
I just need to figure out how to get the SQL View to put it back into separate columns so we can report on it.
More to come...
Or maybe one of you already knows what to do?
Interesting approach, and doable given some limitations:
1. Views in SQL Server need a static return schema (SELECT * is allowed, but SQL Server infers the columns from the query and table schema(s) at the time the view is created). To guarantee that, you'd be stuck with generic names like Col1, Col2, Col3, etc. and a fixed maximum number of columns. The data would be sparsely populated: a data set with fewer columns than the view supports would just have NULLs in the extras, and any data wider than the view supports would be dropped.
2. Performance might be a bit of an issue, since everything has to be derived at runtime and there's no way to index for performance. For small data sets the cost would be negligible, but as more data sets are prepared and kept, and as those data sets grow, performance could degrade rapidly.
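On point 1, the NULL padding falls out naturally if the splitting is done with SQL Server's JSON functions (available since SQL Server 2016). A hypothetical sketch of that behavior: reading past the end of a JSON array returns NULL rather than an error, so a fixed Col1..ColN view is sparsely populated automatically:

```sql
-- JSON_VALUE returns NULL for an index past the end of the array,
-- so a fixed-width view pads short rows with NULLs by itself.
SELECT
    JSON_VALUE('["a","b"]', '$[0]') AS Col1,  -- 'a'
    JSON_VALUE('["a","b"]', '$[1]') AS Col2,  -- 'b'
    JSON_VALUE('["a","b"]', '$[2]') AS Col3   -- NULL
```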
Just a thought for AugSQL that builds on your idea: you could offer upload of a file that auto-creates the underlying table, and from that you could use a Customization Project to efficiently create a DAC that can be used for reporting (no page/BLC needed). AugSQL would replace the Import Scenario, but then you'd let Acumatica do its thing. The tool could infer the data types or let the user choose them, and performance could be improved through indexing.
Once you convert the UNICODE(0) delimiter to a readable delimiter, I think this method could be used to split it into multiple columns. I still need to prove it out though:
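Here's a sketch of one way that split might work, untested and assuming SQL Server 2016+ for the JSON functions (the Col1..Col6 names are placeholders): STRING_ESCAPE renders each NCHAR(0) delimiter as the literal text \u0000 while also escaping any quotes or backslashes in the data, the REPLACE turns those sequences into JSON array separators, and JSON_VALUE reads each column back out:

```sql
-- Sketch: split the NUL-delimited FieldValues back into columns
-- by rewriting each row as a JSON array (requires SQL Server 2016+).
SELECT
    d.LineNbr,
    JSON_VALUE(j.Arr, '$[0]') AS Col1,
    JSON_VALUE(j.Arr, '$[1]') AS Col2,
    JSON_VALUE(j.Arr, '$[2]') AS Col3,
    JSON_VALUE(j.Arr, '$[3]') AS Col4,
    JSON_VALUE(j.Arr, '$[4]') AS Col5,
    JSON_VALUE(j.Arr, '$[5]') AS Col6
FROM SYData d
JOIN SYMapping m
    ON d.CompanyID = m.CompanyID
   AND d.MappingID = m.MappingID
CROSS APPLY (
    -- STRING_ESCAPE makes the value JSON-safe and writes each
    -- NCHAR(0) delimiter as the literal text \u0000; swapping that
    -- for "," turns the whole row into a JSON array of strings.
    SELECT '["' + REPLACE(STRING_ESCAPE(d.FieldValues, 'json'),
                          '\u0000', '","') + '"]' AS Arr
) j
WHERE m.Name = 'DW - Historical Sales'
ORDER BY d.LineNbr
```

Wrapping a query like this in a SQL View would give the fixed-schema, report-friendly shape described in point 1 above.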