IADSS Workshop at KDD
Washington DC, 2022
Data Science Standards – Hiring, Assessing and Upskilling Data Science Talent
KDD 2022, AUGUST 16 – Workshop Proposal
IADSS successfully hosted its third workshop at KDD on Data Science Standards - Hiring, Assessing and Upskilling Data Science Talent.
This half-day workshop included 6 presentations and 1 panel with a total of 11 presenters.
We shared detailed findings from IADSS research on industry needs and practices around external and internal talent pipeline development for firms big and small, with an understanding of the data science job market.
Background, Definition and Audience of the Workshop:
As the imperative to become more data-driven continues to permeate every aspect of business, the demand for data science talent (and more broadly, for data-literate employees across the spectrum) has exploded. The resulting talent pipeline challenge has led organizations to pursue both external hiring and internal upskilling. Competition for this talent is fierce, and given the multi-disciplinary nature of data science, qualified resources are scarce. Confusion about the skills and knowledge that make a data scientist, a data analyst or a data engineer makes this even more problematic. In many cases, organizations end up looking for data science ‘unicorns’, i.e., a rare breed of data science professional qualified across the vast range of skills from computer science to machine learning and from cloud computing to data storytelling, or they hire the wrong person for the job, often resulting in frustration on the part of both the employer and the employee. Developing and managing an analytics talent pipeline, external and internal, identifying the right individual for the organization’s needs through an effective and efficient assessment process, and as a result attaining a high-performing data science team is unfortunately not an easily achieved objective, and it is usually reserved for the largest tech employers with seemingly limitless budgets.
The first workshop in this series was held at KDD-2019, organized by the Initiative for Analytics and Data Science Standards (IADSS), where contributors discussed challenges in defining data-related roles and ways to standardize the skills and knowledge required for a variety of roles in the analytics and data science space. The second workshop was held virtually at KDD 2020, bringing together some of the top thinkers in the field on the topic of training data scientists. Co-chaired by Usama Fayyad and Xiao-Li Meng, Founding Editor-in-Chief of HDSR, it aimed to discuss how we can better train data scientists with the skills and knowledge that the industry needs, and explored how industry and academia can collaborate to make sure we’re not only meeting the demands of today but also preparing for the changes and challenges of the future.
This third workshop in the series went further into industry needs and practices. IADSS conducts ongoing research in this domain, and we shared detailed findings and observations, in addition to contributions from researchers and industry practitioners through an open call for papers and presentations and from invited speakers. To achieve its intended aim, the workshop was held as a half-day working meeting with short talks, invited panels and discussion sessions to plan future steps on the topic.
Attendance at the first two workshops: The first workshop held at KDD-2019 attracted over 200 attendees and involved very active and vigorous discussion of the hot topics. The program for the first workshop is listed on the KDD-2019 web site. The second workshop attracted a large virtual audience (exact number was not reported to us) and featured talks by co-organizers Usama Fayyad and Xiao-Li Meng as well as thought leaders such as Tom Davenport (Babson College) and Jeannette Wing (Columbia University).
Previous Conference Presence
KDD 2020, 2nd IADSS Workshop on Data Science Standards – What do you need to know as a Data Scientist? Special Workshop Theme: Training Data Scientists of the Future. Half-day workshop.
Co-Chairs: Usama Fayyad, Open Insights / IADSS & Xiao-Li Meng, Harvard University
Tom Davenport, President’s Distinguished Professor of Information Technology and Management at Babson College
Keith McCormick, Data Science Author and Instructor, UC Irvine and LinkedIn Learning
Jeannette Wing, Director of Data Science Institute, Columbia University
Pavlos Protopapas, Scientific Program Director for the Harvard Institute of Applied Computational Science
Kjersten Moody, Chief Data Officer at Prudential Financial
Bin Yu, Professor at UC Berkeley
Rayid Ghani, Professor at Carnegie Mellon University
KDD 2019 Anchorage, 1st IADSS Workshop on Data Science Standards
“Proposed Standards on Definitions of Analytics Roles, Skill-sets and Career Paths in the Data Science Industry”
Ming Li, Research Scientist, Amazon; University of Washington
Stacey Schwarcz, Founder, Ariel Analytics
Ary Bressane, Head of Data Innovation Lab, mnubo
Ying Li, Chief Scientist, Eureka Analytics
Amy Shi-Nash, Global Head of Analytics and Data Science, HSBC
Matt Curcio, VP Data, Ripple
Agenda of the Workshop
Biographical Summaries of the Organizers
Usama M. Fayyad, Ph.D.
Usama serves as founder/CEO of Open Insights (founded in 2008), where he works with large and small enterprises on Artificial Intelligence/Machine Learning, BigData strategy, and launching new business models based on Data Assets. Most recently, he served as Interim CTO for Stella.AI, a VC-funded startup in AI for HR/recruiting, and as Interim COTO of MTN2.0, helping develop new revenue streams in mobile payments/MFS and Data-as-a-Service businesses at MTN, Africa’s largest mobile operator.
Usama was the first Global Chief Data Officer & Group Managing Director at Barclays in London (2013-2016), where he also took on an additional role as CIO of Risk, Finance & Treasury Technology in 2015. From 2010-2013 Usama was the co-founder of OASIS-500, a tech startup investment fund, following his appointment as Founding Executive Chairman in 2010 by King Abdullah II of Jordan. Up until joining Barclays in 2013, he was also Chairman, Co-Founder and Chief Technology Officer of Blue Kangaroo Corp, building a mobile search engine service for offers personalization and activation based in Silicon Valley. His background includes Chairman/CEO roles at several startups, including DMX Group (acquired by Yahoo!) and digiMine (Audience Science), which was founded in 2000 in Seattle to build hosted data warehousing and data mining solutions for Fortune 500 companies.
He was the first person ever to hold the Chief Data Officer (CDO) title when Yahoo! acquired his second startup in 2004. In addition to CDO, he was also the Executive VP of Research and Strategic Data Solutions, where he ran Yahoo!'s global data strategy, architected its data policies and systems, and managed its data analytics and data processing infrastructure. The data teams he built at Yahoo! collected, managed, and processed over 25 terabytes of data per day, and drove a major part of ad targeting revenue and data insights businesses globally. He also founded Yahoo! Research Labs, where much of the early work on BigData was made open source and which established the early collaborations that launched Hadoop and other open source contributions.
Usama held leadership roles at Microsoft (1996-2000) and founded the machine learning systems group at NASA's Jet Propulsion Laboratory (1989-1995), where his work on machine learning resulted in the top Excellence in Research award from Caltech, and a U.S. Government medal from NASA.
Usama earned his Ph.D. in engineering in AI/Machine Learning from the University of Michigan. He holds two BSEs in Engineering, an MSE in Computer Engineering and an M.Sc. in Mathematics. He has published over 100 technical articles on data mining, data science, AI/ML, and databases; holds over 30 patents; and is a Fellow of the Association for the Advancement of Artificial Intelligence (AAAI) and a Fellow of the Association for Computing Machinery (ACM). He is active in the academic community with several adjunct professor posts and is the only person to receive both the ACM’s SIGKDD Innovation Award (2007) and Service Award (2003). He has edited two influential books on data mining and served as editor-in-chief on two key industry journals. He is an active angel investor and advisor in many early-stage tech startups across the U.S., Europe and the Middle East. He served on the boards or advisory boards of several private and public companies including Criteo, Invensense, RapidMiner, Stella.AI, Martini Media, Virsec, Silniva, Abe.AI, Medio, NetSeer, Choicestream, and others. On the academic front, he is on advisory boards of the Data Science Institute at Imperial College, AAI at UTS, and The University of Michigan College of Engineering.
Hamit has over 20 years of industry and consulting experience in the areas of analytics, customer relationship management and marketing strategies driven by data. He is a co-Founder of Analytics Center, a company focused on the use of data and analytics in business as well as an advisor or investor in several analytics related initiatives that work in developing vertical machine learning solutions for industries such as advertising and e-commerce.
Hamit was a Founding Partner for EMEA offices of Peppers & Rogers Group, the leading customer-led business strategy consulting firm based in the U.S. He then led the development of the firm in the region by serving clients across the Middle East, Africa and Europe. He also worked as a Partner for the firm’s US office heading up its global Analytics group. In this capacity he oversaw the growth of the analytics practice and helped his clients develop analytics functions, build data infrastructure and deploy analytical models to support business goals.
His industry experience includes several positions within Federal Express in Memphis in marketing analytics and technology, where he led IT and business teams to leverage the enormous amount of data the company generated to serve its customers better.
Hamit is also a frequent speaker, writer and board member at various start-ups and non-profit organizations. He earned his B.Sc. degree in Electronics Engineering at Bogazici University in Istanbul and his MBA degree at the University of Florida.
Dr. Umesh Hodeghatta Rao is an Engineer, a Scientist, and an Educator. He is a faculty member at Northeastern University, specializing in Analytics, AI, Machine Learning, Deep Learning, Natural Language Processing (NLP), Big Data Analytics and Cyber Security. He has more than 25 years of work experience in technical and senior management positions at AT&T Bell Laboratories, Cisco Systems, McAfee, and Wipro. He was also a faculty member at Kent State University, Kent, Ohio, and Xavier Institute of Management, Bhubaneswar, India. He holds a master’s degree in Electrical and Computer Engineering (ECE) from Oklahoma State University, USA and a Ph.D. from the Indian Institute of Technology (IIT), Kharagpur. His research interest is applying AI and Machine Learning to strengthen an organization’s information security, based on his expertise in Information Security and Machine Learning. As a Chief Data Scientist, he helps business leaders make decisions and recommendations linked to the organization’s strategy and financial goals, reflecting an awareness of external dynamics based on a data-driven approach.
Dr. Hodeghatta has published many journal articles in international journals and conference proceedings. In addition, he has authored books titled “Business Analytics Using R: A Practical Approach” and “The InfoSec Handbook: An Introduction to Information Security” published by Springer Apress, USA. Dr. Hodeghatta has contributed his services to many professional organizations and regulatory bodies.
He was an Executive Committee member of IEEE Computer Society (India); Academic advisory member for the Information Systems Audit and Control Association (ISACA), USA; IT advisor for the government of India; Technical Advisory Member of the International Neural Network Society (INNS) India; and Advisory member of the Task Force on Business Intelligence & Knowledge Management. He was listed in Who’s Who in the World in 2012, 2013, 2014, 2015 and 2016. He is also a senior member of the IEEE, USA.
Mark Wagy is a senior data scientist and manager of the AI Solutions Hub at the Institute for Experiential AI (EAI). He also teaches AI as an adjunct professor at Northeastern University College of Professional Studies.
Wagy brings a unique background and perspective to EAI, having worked as a data leader, scientist, and engineer in academia and industry. He was chief technology officer and co-founder of a financial technology company specializing in debt optimization called Solve Finance. He also served as senior director of data science and engineering teams after being one of the first data scientists at WEX, built data search systems as an engineer at LexisNexis, and developed numerical models as a scientist at Medtronic.
While a postdoc at Dartmouth College, Wagy worked on brain-inspired machine learning algorithms. He has also researched the ways in which humans and machines work together to solve complex problems at the University of Vermont and worked on massive-scale machine learning systems at MIT. He earned his doctorate of philosophy in computer science from the University of Vermont. He also received a bachelor’s degree in computer science from the University of Minnesota and a bachelor’s degree in mathematics and physics from Lewis & Clark College.
Ashok N. Srivastava, Ph.D. is the Principal Investigator for the Integrated Vehicle Health Management research project at NASA. His current research focuses on the development of data mining algorithms for anomaly detection in massive data streams, kernel methods in machine learning, and text mining algorithms.
Dr. Srivastava is also the leader of the Intelligent Data Understanding group at NASA Ames Research Center. The group performs research and development of advanced machine learning and data mining algorithms in support of NASA missions. He performs data mining research in a number of areas in aviation safety and application domains such as earth sciences to study global climate processes and astrophysics to help characterize the large-scale structure of the universe.
Dr. Srivastava is the author of many research articles in data mining, machine learning, and text mining, and has edited a book on Text Mining: Classification, Clustering, and Applications (with Mehran Sahami, 2009). He is currently editing two more books: Advances in Machine Learning and Data Mining for Astronomy (with Kamal Ali, Michael Way, and Jeff Scargle) and Data Mining in Systems Health Management (with Jiawei Han).
Dr. Srivastava has given seminars at numerous international conferences. He has a broad range of business experience including serving as Senior Consultant at IBM and Senior Director at Blue Martini Software. In these roles, he led engagements with numerous Fortune Global 500 companies including Bank of America, Chrysler Corporation, Saks 5th Avenue, Sprint, Chevron, and LG Semiconductor.
He has won numerous awards including the NASA Exceptional Achievement Medal for contributions to state-of-the-art data mining and analysis, the NASA Distinguished Performance Award, several NASA Group Achievement Awards, the IBM Golden Circle Award, and the Department of Education Merit Fellowship. He holds a Ph.D. in Electrical Engineering from the University of Colorado Boulder.
Greg Makowski has been Head of Data Science Services at FogHorn Systems Inc. since February 20, 2018 and heads the Data Science Services (DSS) group, providing data mining and big data consulting services for FogHorn clients. He has over 26 years of experience in data mining, deploying 90+ models for clients globally. He was Director of Data Science at LigaDATA. He holds a patent named Event Lift Forecasting, which covers automated forecasting for retail promotion events.
Greg Makowski is also the Vice Chair at Chair of Data Science SIG, which is a local chapter of the Association for Computing Machinery (ACM).
Makowski has a master’s degree in Computer Science from Western Michigan University.
A behavioral scientist, Paula Payton is an expert at harvesting consumer and market insights from data to shape better product, digital, and retail experiences. As a faculty member with Columbia University's School of Professional Studies’ Applied Analytics degree program, she teaches Analytics + Leading Change, as well as the Integrated Capstone project, the culminating educational experience for students in the program. Her research interests concern generating behaviorally anchored insights and developing executable strategies for organizations to reach and serve their consumers. Payton is currently spearheading global research on how enhanced organizational agility in data-driven retail companies drives sustained financial performance.
Payton’s work uncovering market trends, emerging technologies, and behavioral shifts has been leveraged by retail, fashion and consumer goods companies. She has consulted for companies such as Arcadia, Luxottica, Walgreens, Best Buy, Dollar General, H. E. Butt Grocery, Buehler’s, Sprint, Intercontinental Group of Department Stores, Cisco, Intel, Avon, Coca-Cola, General Mills, Novartis, GlaxoSmithKline, Pepperidge Farm, Procter & Gamble, Johnson & Johnson, S. C. Johnson, British Airways, and General Electric. Most recently, she has been part of the senior leadership team of a tech start-up, commercializing innovative technologies and new data streams to positively impact health and wellness.
Payton has held key roles in higher education as an administrator, researcher, and instructor. She has taught executive, graduate, and undergraduate courses for INSPER Institute of Education + Research (São Paulo, Brazil), Lundgren Center for Retailing (University of Arizona), Kelley School of Business (Indiana University), and New York University School of Professional Studies (NYU-SPS). As a former academic director of NYU-SPS’ Strategic Communication, Marketing, and Media Management Department, she was responsible for all integrated marketing, public relations/corporate communication, and media management graduate degrees and programs, bringing her background in consumer insight, digital innovation, and brand strategy to bear upon the position. Previously, she worked in senior staff and consulting roles for at-retail marketing (POPAI) and retail/consumer goods (RILA) trade associations, where she co-chaired marketing and innovation working groups of senior executives in the industry. She began her career at an agency, using marketplace and consumer insight to help Fortune 100 companies build brands and customer experiences profitably.
In addition to earning a B.A. and M.A., Payton has completed postgraduate diplomas in marketing strategy (Cornell University Johnson Graduate School of Management), and data science (Johns Hopkins University Bloomberg School of Public Health). She is currently studying operations and digital supply chain management through MIT’s Sloan School of Management.
Keith McCormick is an independent data mining consultant, trainer, author, and conference speaker. He has a wealth of consulting experience in statistics, predictive analytics, and data mining. For many years, he has worked in the SPSS community, first as an External Trainer and Consultant for SPSS Inc., and then in a similar role with IBM. He possesses a BS in Computer Science and Psychology from Worcester Polytechnic Institute. He is an expert in IBM's SPSS software suite including IBM SPSS Statistics, IBM SPSS Modeler, AMOS, and Text Mining. He is active in statistics groups online and blogs at KeithMcCormick.com. More recently he has been developing courses at Lynda.com to complement his authoring of books. He enjoys hiking in out of the way places, finding unusual souvenirs while traveling overseas, exotic foods, and old books.