Universities Answer the Call

While the first university programs in “Data Science” - including “Analytics” - emerged in 2006-2007, the evolution of the discipline has been, well ... “lumpy”. As we will discuss in more detail in Chapter 4, the first wave of data science programs emerged at the master’s level. This is unusual; typically, academic disciplines evolve first at the undergraduate level, with progressively more specialized and deeper outlets for formal study developing later in master’s and doctoral programs. Data science programs were introduced at the master’s level, followed by the doctoral level (2015), with the first formal undergraduate programs finally being introduced in 2018. Part of the reason for the unorthodox evolution was the deafening call for talent from all sectors of the economy.

The Davenport and Patil article called out “... The shortage of data scientists is becoming a serious constraint in some sectors”. In the same year (2012), the research firm Gartner [4] reported that there was an expected shortage of over 100,000 data scientists in the United States by 2020 (the reality was closer to 10x this number). A year earlier, the heavily cited McKinsey Report [5] titled “Big data: The next frontier for innovation, competition, and productivity” highlighted that “... the United States alone faces a shortage of 140,000 to 190,000 people with deep analytical skills as well as 1.5 million managers and analysts to analyze big data and make decisions based on their findings. The shortage of talent is just the beginning”. In 2014, the consulting firm Accenture reported that more than 90 percent of its clients planned to hire people with data science expertise, but more than 40 percent cited a lack of talent as the number one problem [6].

Universities, not typically known for their responsiveness, ramped up programs at lightning speed (by academic standards).

However, if you wanted to recruit for “data science” talent at a university, where would you go? Should you go to the College of Computing? Would it be in the College of Business? Is it in the Department of Mathematics? Statistics? Is there even a Department of Data Science?

If you are not sure where you would go, that would make you incredibly normal - and one of the reasons we wrote this book. There is more variation in the housing of data science than of any other academic discipline on a university campus. Why the variation? And why should you care?

The answer to the first question - Why the variation? - may not be straightforward. The Prussian statesman Otto von Bismarck is quoted as saying, “If you like laws and sausages, you should never watch either one being made” [7]. As in any organization, not all academic programs are a function of long-term, well-considered strategic planning - many programs evolve at the intersection of resources, needs, and opportunity. As universities began to formally introduce data science programs around 2006, there was little consistency regarding where this new discipline should be housed. Given the “academic ancestry” of statistics and computer science, as well as the domain-specific areas of application (e.g., healthcare, finance, marketing, manufacturing), it is not surprising that there is variation in the placement of programs across the academic landscape. See Figure 1.3.

Figure 1.3 Academic location of programs in "data science", "analytics", and "machine learning" [8].

Exacerbating this, we do not yet have a universal consensus as to what set of competencies should be common to a data science curriculum - again largely due to its transdisciplinary foundations. The fields of computer science, mathematics, statistics, and almost every applied field (business, healthcare, engineering) have professional organizations and long-standing models for what constitutes competency in those fields. Simply combining them in an additive fashion would make any such curriculum unwieldy, ineffective, and not reflective of a unified discipline - or the demands of the marketplace. Importantly, these disciplines have different vocabularies and describe similar concepts using different terms.

The most comprehensive approach to a standardized data science curriculum has been undertaken by an open-source effort called the “Edison Data Science Framework” [9]. A list of topics for inclusion in data science programs is provided below:

  • 1. Statistical methods and data analysis
  • 2. Machine learning
  • 3. Data mining
  • 4. Text data mining
  • 5. Predictive analytics
  • 6. Modeling, simulation, and optimization
  • 7. Big Data infrastructure and technologies
  • 8. Infrastructure and platforms for data science
  • 9. Cloud computing
  • 10. Data and applications security
  • 11. Big Data systems organization and engineering
  • 12. Data science/Big Data application design

It is important to reiterate that, unlike fields such as accounting, engineering, medicine, and law, there is no accrediting organization or standardized curriculum for data science.

As a result, the “data science” curriculum may look very different at different universities. This is a concept that we will discuss in more detail in later chapters.

The second question might be more relevant - Why should you care where a data science program is academically housed?

Generally, universities have approached the evolution of data science from one of two perspectives - as a discipline “spoke” (or series of electives) or as a discipline “hub” (as a major). See Figure 1.4.

Figure 1.4 Data science and analytics programs as academic "Hubs" and "Spokes".

Programs that are “hubs” - reflecting the model on the left - have likely been established as a “major” field of study. These programs are likely to be housed in a more computational college (e.g., Computing, Science, Statistics) or research unit (like a Center or an Institute) and will focus on the “science of the data”. They tend to be less focused on the nuances of any individual area of application. Hub programs will (generally) allow (and encourage) their students to take a series of electives in some application domain (e.g., students coming out of a hub program may go into Fintech, but they may also go into healthcare - their major is “data”). Alternatively, programs that are “spokes” - reflecting the model on the right - are more likely to be called “analytics” and are more frequently housed in colleges of business, medicine, and the humanities. Programs that are “spokes” are (generally) less focused on the computational requirements and are more aligned with applied domain-specific analytics. Students coming out of these programs will have stronger domain expertise and will better understand how to integrate results into the original business problem but may lack deep computational skills. Students coming out of “hub” programs will likely be more comfortable moving along the continuum of data types highlighted in Figure 1.1, while students coming out of “spoke” programs are more likely to work with data that is “Small, Structured, and Static” (which is actually sufficient for many organizations). Neither is “wrong” or “better” - the philosophical approaches are different.

In addition, programs are constrained by the number of “credit hours”/courses that can be included. Most undergraduate programs will have about 40 courses in total - with fewer than 20 allocated to the “major”. Most master’s programs will have closer to 10 courses, while PhD programs are four to six years in duration but are heavily focused on research. As a result of these programmatic constraints, program directors have had to prioritize some concepts over others. Programs at each of these levels will be covered in Chapters 3, 4, and 5, respectively.

An ongoing longitudinal study tracking the salaries and educational backgrounds of data scientists [10] provides a similar distinction between data scientists and analysts:

... predictive analytics professionals ... (are) those who can apply sophisticated quantitative skills to data describing transactions, interactions, or other behaviors to derive insights and prescribe actions ... Data scientists ... are a subset of predictive analytics professionals who have the computer science skills necessary to acquire and clean or transform unstructured or continuously streaming data, regardless of its format, size, or source.

In other words, all data science includes “analytics”, but not all analytics includes “data science”. Referring to the list of topics from the Edison Framework above, “analytics” programs would be more aligned with topics 1-5, whereas “data science” programs would be more likely to cover the full list. Throughout this book, we will refer to academic programs as “data science”, with specific distinctions for “analytics” programs as needed.

A Wide Range of Solutions

Over the next five chapters, we will detail how managers of analytical organizations can leverage university collaborations to address the unprecedented challenges and opportunities that have emerged through the evolution of data and the rise of data science. Examples include:

■ The need to develop a reliable pipeline of “known” and consistent talent - at the undergraduate, master’s, and doctoral levels - along with the opportunity to contribute to updates and edits to the curriculum producing that talent, as appropriate.

■ Working with faculty and doctoral students to facilitate innovation, research, and development for new products and services. Many companies use research collaboration as a way to position their organization as a “thought leader” in their industry.

■ Access to faculty to provide “ad hoc” employee workshops and training.

■ Formal opportunities for existing employees to “upskill” their knowledge.

■ Partnering with universities to contribute to the local community for “data science for social good”.

As we will discuss in the next chapter, at their core, universities do two things: they teach and they discover new knowledge. In the context of data science, while many universities do both well, they will have different approaches, different points of emphasis, and different motivations for working with you.

Endnotes

  • 1. https://cds.cern.ch/record/2027752?ln=en Accessed August 2, 2020.
  • 2. https://en.wikipedia.org/wiki/C_F_Jeff_Wu Accessed August 3, 2020.
  • 3. https://hbr.org/2012/10/data-scientist-the-sexiest-job-of-the-21st-century Accessed August 2, 2020.
  • 4. https://blogs.gartner.com/doug-laney/defining-and-differentiating-the-role-of-the-data-scientist/ Accessed August 3, 2020.
  • 5. https://www.mckinsey.com/business-functions/mckinsey-digital/our-insights/big-data-the-next-frontier-for-innovation Accessed August 2, 2020.
  • 6. https://www.accenture.com/us-en/_acnmedia/Accenture/Conversion-Assets/DotCom/Documents/Global/PDF/Industries_14/Accenture-Big-Data-POV.pdf Accessed August 3, 2020.
  • 7. https://en.wikipedia.org/wiki/Otto_von_Bismarck Accessed August 4, 2020.
  • 8. This data was aggregated by the authors from public academic websites.
  • 9. https://edison-project.eu/sites/edison-project.eu/files/attached_files/node-447/edison-mc-ds-release2-v03.pdf Accessed August 2, 2020.
  • 10. https://www.burtchworks.com/big-data-analyst-salary/big-data-career-tips/the-burtch-works-study/ Accessed August 2, 2020.
 