Thanks in large part to advances in software and technology, we can
acquire or create data on just about everything. This newfound ability to
generate and store data easily (and cheaply) will likely have a profound
impact on the procurement profession. This is the first of many posts
on data science and its value to the procurement professional. In future posts,
I will focus on data acquisition, data cleaning, data visualization,
and predictive analytics. These posts will offer insight into how data can
be processed and manipulated to help drive savings and support spend
management.
It is important to note that these articles will likely be technical in nature. As
my background is in mathematics and statistics, I will generally include pertinent
computations, derivations, and technical details. I intend to offer
an elementary example in each post, including code, that an organization
pursuing strategic procurement might use. I will draw on a mix of
technologies that we frequently employ at Source One, such as R, Python,
Bash/DOS, SQL, VBA, and C/C++. Occasionally I may mix in parallel or
distributed programming.
Before we proceed any further, we should clearly define data science. According to
Wikipedia, data science is the extraction of knowledge from structured or
unstructured data. We will go one step further and say that our intended use
for data science is developing and extracting knowledge from data for the
purposes of cost reduction via effective decision making. We add this last clause
to emphasize that we should be able to derive actionable insights from the data.
What exactly is data science useful for? Simply put, anything relating to
automation and prediction. With regard to automation, we can perform accurate
and efficient computations that let us make informed business decisions
quickly while saving man-hours and reducing errors. With regard to
prediction, once we have implemented a means to aggregate and
cleanse data, we can use it to increase organizational visibility and mine it
for predictive purposes. Some specific uses within the realm of
strategic spend include baselining and comparative analysis, spend analysis, customer
profiling, geography analysis and optimization, and external factor tracking
(such as weather).
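As a rough illustration of the spend analysis and baselining uses mentioned above, here is a minimal sketch in Python, one of the technologies listed earlier. The purchase records, supplier names, baseline figures, and the helper functions `spend_by` and `compare_to_baseline` are all hypothetical and invented for this example. Real work would start from an organization's own purchasing data.

```python
from collections import defaultdict

# Hypothetical purchase records: (supplier, category, amount in USD).
# These figures are invented purely for illustration.
purchases = [
    ("Acme Supply", "Office", 1200.00),
    ("Acme Supply", "Office", 800.00),
    ("Borealis Freight", "Logistics", 4500.00),
    ("Cobalt IT", "Software", 3000.00),
    ("Borealis Freight", "Logistics", 500.00),
]


def spend_by(records, key_index):
    """Total the amount column, grouped by the field at key_index."""
    totals = defaultdict(float)
    for record in records:
        totals[record[key_index]] += record[2]
    return dict(totals)


def compare_to_baseline(current, baseline):
    """Return current minus baseline for every key seen in either period."""
    keys = set(current) | set(baseline)
    return {k: current.get(k, 0.0) - baseline.get(k, 0.0) for k in keys}


# Spend analysis: where is the money going?
supplier_totals = spend_by(purchases, 0)
category_totals = spend_by(purchases, 1)

# Baselining: hypothetical per-category spend from a prior period.
baseline = {"Office": 2500.00, "Logistics": 4000.00, "Software": 3000.00}
variance = compare_to_baseline(category_totals, baseline)

# A positive variance means spend rose relative to the baseline.
for category in sorted(variance):
    print(f"{category}: {variance[category]:+.2f}")
```

Even a grouping this simple shows which categories moved against the baseline. That is the kind of actionable insight we are after.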
For my first blog post series, I will use the R programming language. R is a free
software environment for statistical computing and data visualization. I have chosen R
for two reasons. First, it is a statistical language that offers excellent
visualization capabilities, as demonstrated here: R statistical software. Second, R is
likely to become a very useful business tool in the near future, as it is currently
being integrated into Microsoft's existing technology. See Revolution Analytics.
As we must have data in order to perform analysis, the first series I publish will be on Data Acquisition. I’ll update the links below as the articles go live.
- Data Acquisition & Web Scraping via R: Structured Data I
- Data Acquisition & Web Scraping via R: Structured Data II
- Data Acquisition & Web Scraping via R: Structured Data III
- Data Acquisition & Web Scraping via R: Unstructured Data I
- Data Acquisition & Web Scraping via R: Unstructured Data II
- Data Acquisition & Web Scraping via R: Unstructured Data III
Just the perfect post for procurement professionals, have read many books but this is the simplest and well written article.
Regards,
Anurag