Extracted from a thread begun on @OP+1jgxdnf0h
207 replies (most recent on top)
Another former CAS team member here, keen to read more... Please continue...
TKTS was poorly scoped and lacked technical leadership with true, world class expertise in database technology.
I think that BH and JM would disagree. EVERYONE was told to read up on ODBC.
Why are you doing this? What’s the point? This has nothing to do with layoffs. I suppose that with the world going to he11 in a basket, it doesn’t matter.
Part 1b
OS and the rise of high-performance computing at SAS.
Sometime around 2002, OS began his SAS R&D career as an analytics developer. He was formerly an academic and university professor, holding a PhD in forestry, although with a very strong background in academic biostatistics and heavy applied math. He brought a work ethic and focus that was definitely next level compared to rank-and-file SAS R&D devs. OS not only excelled at developing complex analytics procedures, but also learned non-trivial details of the platform services code that surfaces data and provides various memory management, I/O and thread abstractions to the analytic algorithms. He brought a sense of urgency and innovation to his work that eventually caught the attention of JG. Although OS’ lack of a formal academic computer science background causes some folks here to chafe, remember that JG is also a statistician (who wrote quite a bit of systems code himself back in the day) and that SAS has deep roots in analyzing agricultural data.
One of the initiatives at this point (around 2005) was “bringing SAS analytics to the data”, in contrast to the more traditional method of doing ETL in the process of importing data into SAS for further processing and analysis. The continually rising volumes of data, newer and more intelligent storage appliances, and Hadoop motivated this. Data “has gravity”, and when you have a lot of it, moving and replicating it to these platforms or traditional DBMSs can be expensive and unwieldy.
OS was an early innovation leader in helping bring SAS analytics to the data through HPA and related in-database technologies. OS did not just “come up with ideas”. He worked his a-s off over long hours to write code and demonstrate how things could work. He led this initiative with a small team that built the core HPA components while working with a larger, yet limited, cast of characters across the analytics, data management, and core divisions within R&D. Many of these folks would eventually become founding members of the CAS development team.
His efforts on HPA brought OS onto the international SAS users scene as he traveled with JG, meeting with some of our largest users across the world. The hard lessons learned building HPA and the in-process database connectors, along with intelligence gathered during these years (roughly 2006 to 2009), led to the need for another evolution in parallelized, high-performance SAS computing. Enter the LASR Analytic Server.
To adequately summarize the need for LASR, first recognize that every computing platform (be it a hardware storage appliance that runs a combination of complex proprietary and open source OS like FreeBSD or Linux, or an open source big data system like Hadoop, or a proprietary DBMS) must support a considerable number of similar services, several of which extend all the way to ISA primitives in the underlying HW. While there are some common conventions, there is also impedance mismatch, especially with things like error and exception handling. This makes it very difficult for a system like SAS, which has its own memory management, I/O, task/threading, etc. subsystems, to perfectly integrate, especially at the OS process level, with other non-SAS software components, i.e. the very storage appliances and systems that HPA and the in-process data connectors interface with.
Hence, a strong motivator for LASR was to achieve and even exceed the performance benefits of HPA for SAS-specific analytical processing, while doing so in a homogeneous computing framework that wasn’t constantly butting heads with the runtime of an underlying storage appliance.
"SPDS was the brainchild of an extremely difficult individual who few could tolerate working with for any length of time"
Someone put initials to the above "brainchild".
Part 1a
2000-2010: Evolving to what ultimately became Viya
Beginning in this approximate timeframe, SAS R&D was fresh off the wasted years of version 7 (props to whoever reminded us of that) and, among a plethora of other issues, making two different attempts at developing advanced data management infrastructure: SPDS, which had begun around 1994, and TKTS. Remnants of these technologies remain in the bowels of SAS to this day. However, neither was able to deliver on a comprehensive or successful product strategy that would become ubiquitous at SAS customer sites. SPDS was the brainchild of an extremely difficult individual who few could tolerate working with for any length of time. TKTS was poorly scoped and lacked technical leadership with true, world class expertise in database technology.
Outside of SAS, yet certainly among our customers, the worldwide volume of data was continuing to grow exponentially, mirroring the explosion of Internet-based technologies, public cloud computing and eventually IoT.
Yet, at that point, was there an identifiable, singular analytics and data management technology that everyone could agree dominated? I think we can agree the answer was NO. Incumbent vendors like SAS were scrambling to enhance their core products (developed for earlier computing paradigms) in time to capture new market share. New players, led by hyperscale cloud vendors, were inventing technologies and creating products native to Internet and cloud environments.
When compared with 2015-2025, open source management and analytics was mostly nascent in 2000-2010. However, in this timeframe MapReduce and Hadoop did come into being and continued on a steep ascent while SAS rushed to integrate, as we had been doing with the in-database effort for storage appliances and, of course, earlier classic data sources on mainframe, mini, and PC computers, i.e. the computing paradigms MVA was built for.
Many Viya/CAS core team members started their SAS R&D careers designing and building key MVA components. I think it’s fair to say we understood an ingrained ethos, enshrined by JG himself: SAS builds platform components ultimately to deliver data management and subsequently analytics on the resulting prepared data. SAS was simply not investing in the scale of development it would take to successfully build a modern cloud or database platform at the scale of our largest competitors like Oracle, Google, AWS, IBM, etc., and ultimately, of course, open source software.
Systems-level developers serve SAS’ primary mission, which is analytics. Our job was essentially to create abstraction layers to interface with underlying operating systems, target DBMSs and network communication technologies, while providing common runtime services like memory management, multithreading, SAS-native data formats, etc. to enable our analytics to run as fast as possible.
I’m a former Viya core team member, and to my recollection OS had JG’s blessing to build out CAS as the primary Viya engine in support of the “new fast train” that would replace the “older SAS V9 train”. Strict compatibility between the two was not an initial mandate, yet intermittently became a contentious subject as Viya development continued. The old adage “hindsight is always 20/20” certainly applies here.
An important premise for understanding the dynamic you are describing is that internal discussions regarding Viya design date back to 2010-2012. CAS, Viya’s core engine, is the evolution of earlier efforts to advance traditional SAS technology going back to 2005 (beginning with HPA/In-Database, and finally LASR).
Let’s consider the confluence of macro and micro circumstances occurring around 2010. This will need to be a multi-post comment: