BigID CEO on How to Govern Unstructured Data Informing LLMs
Dimitri Sirota on Why It Is Simpler to Manage and Secure Data Found in Spreadsheets
Organizations struggle with governing the data that goes into and informs large language models since it's in documents rather than spreadsheets or SQL databases, said BigID CEO Dimitri Sirota.
Companies need a governance framework for managing unstructured data that determines whether personally identifiable information such as Social Security numbers or confidential health data is being shared with a large language model, Sirota said. Unstructured data has relied on its own set of technologies for management and security because the blocks of text have no enumeration and could include image data or binary data (see: The 'Privacy First' Strategy).
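As a rough illustration of the kind of check such a framework might run, the sketch below scans free text for strings shaped like U.S. Social Security numbers and masks them before the text is passed along. It is not BigID's method. The pattern, the function names `find_ssns` and `redact_ssns`, and the mask string are all hypothetical, and a simple regular expression of this kind will miss unformatted numbers and flag some strings that are not SSNs.

```python
import re

# Assumption: SSNs appear in the common NNN-NN-NNNN form.
# Real-world detection would need to handle many more formats and contexts.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def find_ssns(text: str) -> list[str]:
    """Return every SSN-shaped string found in a block of free text."""
    return SSN_PATTERN.findall(text)


def redact_ssns(text: str, mask: str = "[REDACTED-SSN]") -> str:
    """Replace every SSN-shaped string with a mask (hypothetical helper)."""
    return SSN_PATTERN.sub(mask, text)


if __name__ == "__main__":
    document = "Patient record: SSN 123-45-6789, follow-up scheduled."
    if find_ssns(document):
        # Mask the sensitive values before the text goes anywhere else.
        document = redact_ssns(document)
    print(document)
```

The point of the sketch is only that unstructured text has no column labeled "SSN" to filter on, so the content itself has to be inspected.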
"Historically, most of the governance tooling has been oriented around structured data - data that feels or looks like a spreadsheet that could be in a SQL database or a data warehouse," Sirota said. "And so there's a little mismatch in terms of the data that you historically have governed for your analytics and the data that you need to govern to inform your conversational AI."
In this video interview with Information Security Media Group, Sirota also discussed:
- How putting a governance framework around unstructured data is different;
- How data minimization can help organizations reduce their attack surface;
- How BigID's approach to data privacy differs from OneTrust and Securiti.
Sirota, a privacy expert and identity veteran, co-founded and has led BigID since its inception in early 2016. He is a serial entrepreneur, investor, mentor and strategist. Sirota previously founded two enterprise software companies - eTunnels, which focused on security, and Layer 7 Technologies, which focused on API management and was sold to CA Technologies in 2013.