Alteryx is a superb tool for data manipulation, and it’s generally very fast at what it does. However, this only encourages us to put large volumes of data through it, which can cause annoying pauses during workflow development. Perhaps your source database is none too fast, or perhaps whatever function you’re asking Alteryx to perform over a billion rows of data is simply complicated.
Being inclined to somewhat iterative development, I’ve often had to go and make several cups of tea whilst it whirs away. I had desperately longed for a fictional feature that would let me right-click on a tool and say “run from here”. In my dream world, this would take a copy of the data that came into the selected tool the last time the workflow ran, and process it through (only) the tools to the right of the selected tool. Thus if you wanted, for instance, to run some R routine on a big pile of previously manipulated data, you could right-click and run from the R tool, and you would not have to wait for the data to be collected and transformed again.
My dream world is not quite here, but in the meantime there’s a super-useful tool available from Alteryx employee MacRo on their community site: the Cache Dataset Macro.
Clear instructions are shown on their site, but essentially you pop it just before the tool you want to “run from here”. Set it to “Write” mode, which will then write out a file to disk of all the data coming into it the next time you run the workflow.
Then switch it to “Read” mode, disable all the tools to the left of it, and then when you next run the workflow it will ignore everything to the left of the cache tool and feed your “written” file into the tool it’s connected to.
It’s really equivalent to manually using an output tool to save out a file, and then an input tool to read it back in, but it’s life-changingly faster and less hassle than actually doing that.
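For anyone who thinks more easily in code, here is a rough sketch of the same write-then-read pattern in Python. It is only an analogy, not how the macro is actually built: the function name `cached_step`, the `mode` values and the pickle file format are all my own assumptions for illustration.

```python
import pickle
from pathlib import Path


def cached_step(mode, cache_path, upstream):
    """Return rows from a slow upstream step, or from a cache file on disk.

    Hypothetical helper for illustration only.
    mode="write": run upstream, save its rows to cache_path, pass them on.
    mode="read":  skip upstream entirely and load the rows from cache_path.
    """
    cache_path = Path(cache_path)
    if mode == "write":
        rows = list(upstream())  # the slow part: runs every time in write mode
        with cache_path.open("wb") as f:
            pickle.dump(rows, f)
        return rows
    if mode == "read":
        with cache_path.open("rb") as f:
            return pickle.load(f)  # upstream is never called in read mode
    raise ValueError(f"mode must be 'write' or 'read', got {mode!r}")


if __name__ == "__main__":
    def slow_upstream():
        # Stand-in for the expensive database pull and transformations.
        return [["alice", 3], ["bob", 5]]

    # First run: do the slow work once and write the cache file.
    rows = cached_step("write", "cache_demo.pkl", slow_upstream)

    # Later runs: flip to read mode and iterate on the downstream logic only.
    rows = cached_step("read", "cache_demo.pkl", slow_upstream)
    print(sum(count for _, count in rows))

    # As with the macro, the cache file sticks around until you delete it.
    Path("cache_demo.pkl").unlink()
```

Flipping `mode` from "write" to "read" plays the same role as switching the macro’s mode and disabling everything to its left: the expensive part is skipped and the downstream steps get the saved data.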
Right now it doesn’t delete the files it writes (apparently that’s an upcoming feature, but in the meantime be sure to clear any massive ones out manually), and there’s no way for it to disable the tools to the left of it automatically, though you can use a tool container to do that. Nonetheless, it’s a huge step up for my Alteryx productivity and one I am most grateful for!
Bonus point: I assume the icon MacRo chose for the tool is based on the obvious homophone-based pun, which is surely enough to brighten up the day of any long-suffering analyst.