Palo Alto, California, United States
Emily can introduce you to 10+ people at Stripe
21K followers
500+ connections
Experience & Education
-
Stripe
Other similar profiles
-
Sam Segran
Texas Cybersecurity, Education and Economic Development Council
3K followers · Lubbock, TX
Explore more posts
-
Bunty Shah
MSCI Inc. • 4K followers
🚀 Rethinking Agentic RL for LLMs: Driving Real-World Autonomy

As a GenAI AI Architect, I see the industry moving rapidly from static, single-shot AI models to agents capable of multi-turn, autonomous problem solving. But what does it *actually* take to turn large language models (LLMs) into robust, multi-turn agents in open-ended environments?

📄 "A Practitioner's Guide to Multi-Turn Agentic Reinforcement Learning" offers actionable clarity. The authors dissect the chaos of current approaches and propose a practical framework grounded in real experimentation, not just theory.

✨ At its core, their recipe consists of:
- 🏗️ Environment: Building up agent skills through progressively complex scenarios (think: spatial, object, and solution scale). Training on easier tasks still yields transferable skills—an insight that should shape our context engineering strategies.
- 🎯 Reward: Structuring feedback matters. Dense milestone rewards accelerate training, but only if well-aligned with your RL algorithm. Too sparse, and learning stagnates; too frequent, and signals can get noisy.
- 🧩 Policy: Smart initialization with demonstration-based priors (SFT) paired with online RL makes agents sample-efficient and more generalizable.

The headline? It's the multi-turn design—not just algorithm choice—that really drives success.

🧪 Their benchmarks across TextWorld, ALFWorld, and SWE-Gym show LLMs growing beyond scripted responses toward true agentic behavior—adapting, generalizing, and operating in the wild.

🔍 For GenAI leaders, the biggest friction isn't always about model size—it's about crafting environments, rewards, and policies that unlock real autonomy. This work brings much-needed rigor and recipe-driven guidance, cutting through the hype with insights we can act on.

🤔 Which pillar do you find most challenging in agentic AI development: environment complexity, reward structure, or policy optimization?
#AIResearch #GenAI #LLM #MachineLearning #DeepLearning #AIEngineering #ReinforcementLearning #AgenticAI
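The sparse-vs-dense reward trade-off the post highlights can be sketched in a few lines. This is a toy illustration only: the milestone names and weights are hypothetical, not taken from the paper.

```python
# Toy contrast between a terminal-only (sparse) reward and a dense
# milestone reward for a long-horizon agent task.
MILESTONES = ["found_key", "opened_door", "reached_goal"]

def sparse_reward(events):
    # Zero signal until the final goal is reached.
    return 1.0 if "reached_goal" in events else 0.0

def dense_reward(events):
    # Partial credit per intermediate milestone keeps the learning
    # signal alive even when the terminal reward is rarely seen.
    partial = sum(0.25 for m in MILESTONES[:-1] if m in events)
    terminal = 0.5 if "reached_goal" in events else 0.0
    return partial + terminal

print(sparse_reward(["found_key"]))                                # 0.0
print(dense_reward(["found_key"]))                                 # 0.25
print(dense_reward(["found_key", "opened_door", "reached_goal"]))  # 1.0
```

A run that only finds the key gets nothing under the sparse scheme but a usable gradient signal under the dense one, which is why milestone shaping accelerates training on tasks like these.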
8
-
Michal Barodkin
NeuroEdit AI • 2K followers
LLMs are getting commoditized.

The pace of core model breakthroughs has clearly slowed. We are no longer seeing step-function jumps from “just a bigger model”.

For most real systems today:
• architecture matters more than the base LLM,
• planning, orchestration, memory, and verification dominate quality,
• the model itself is increasingly a replaceable component.

In my own code factory, I can swap GPT, Claude, or Gemini with minimal impact. There are differences in latency and edge cases, but they no longer define the system.

The next real breakthroughs will not come from GPT-6. They will come from new architectures built around models, not inside them.

LLMs are becoming infrastructure. Systems are becoming the product.
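The "model as replaceable component" claim boils down to coding against one narrow interface and keeping providers behind a registry. A minimal sketch, with stub provider functions standing in for real vendor SDK calls:

```python
from typing import Callable

# One narrow interface; each provider is a swappable registry entry.
# The lambdas are stubs, not real GPT/Claude/Gemini SDK calls.
Provider = Callable[[str], str]

REGISTRY: dict[str, Provider] = {
    "gpt":    lambda prompt: f"[gpt] {prompt}",
    "claude": lambda prompt: f"[claude] {prompt}",
    "gemini": lambda prompt: f"[gemini] {prompt}",
}

def complete(prompt: str, provider: str = "gpt") -> str:
    # Swapping the base LLM becomes a config change, not a rewrite.
    return REGISTRY[provider](prompt)

print(complete("plan the release", provider="claude"))  # [claude] plan the release
```

When the system is structured this way, planning, orchestration, and verification live above the interface, which is exactly why the base model stops defining the system.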
1
-
Lakshmi Shankar
Together • 3K followers
Thrilled to announce that Together Fund is investing in Sentra, alongside a16z speedrun!

You track results in Jira. Decisions in Notion. Conversations in Slack. But the reasoning, the debates, trade-offs, and context behind why you chose A over B, disappears into what we call "Dark Matter." A decision made in March looks insane by July because no one remembers the constraints that made it smart.

I lived this firsthand at Twitter scaling from 800 to 8,000 employees, and at Google while launching AI Overviews to billions at planet scale. The problem isn't process. Process is compensation for something deeper: organizational amnesia. An organization’s "Systems of Record" don't solve this; they encode it. They store what happened, never why.

That's why we are investing in Sentra. Sentra is the always-on collective memory that eliminates organizational amnesia by maintaining accurate context for all members and agents, functioning as an operational nervous system. It connects to every channel where work happens, meetings, Slack, email, code commits, docs, calendars, and treats them not as artifacts to search, but as living signals to synthesize. The fleeting and the permanent, unified into a memory that understands.

The founding team is built for this:
- Jae Gwan Park (CEO): Product-first founder, memory systems research at UofT and MIT
- Ashwin Gopinath (CSO): Former MIT professor, created "Reflexion" (NeurIPS 2023), agents that learn from mistakes, 2x founder
- Andrey Starenky (CTO): Early Vapi engineer, ex-IBM, built to process enterprise-scale data firehose

Together is an operator-led fund. We invest in problems we've lived. This is one of them. Many congrats Jae, Ashwin and Andrey, we are so excited to partner with you!

Read the full thesis: https://lnkd.in/gixj9cE4
Book a demo: https://www.sentra.app/

#OrganizationalMemory #AI #Sentra #TogetherFund #a16z #ContextGraphs
71
3 Comments
-
Philip Karpowich
Liberty Mutual Investments • 608 followers
The build-vs-buy question in AI gets harder each week as the frontier races forward. Alexander Oettl, Sampsa Samila, and Sharique Hasan's new working paper introduces the Generality-Accuracy-Simplicity (GAS) framework, a clear lens for deciding when an off-the-shelf model is enough and when a specialized stack earns its keep. GAS pinpoints where complexity really lives—inside the model, the organization, or the vendor—and turns “tea-leaf reading” into a structured strategy call. I’ll be using this framework in upcoming roadmap and governance discussions. Highly recommended read for anyone wrestling with AI investment choices.
11
1 Comment
-
Anshu Agarwal
Converge • 6K followers
Yesterday, after Berkeley SkyDeck Demo Day, I had the privilege of attending a fireside chat with Prof. Ion Stoica (UC Berkeley professor, co-founder of Databricks, Anyscale, LMArena). The conversation, moderated by Chon Tang (co-founder of SkyDeck), touched on one of the most pressing debates in tech today: Should AI development be open source or closed source?

Prof. Stoica’s perspective was both clear and thought-provoking:
- On siloed efforts: Many brilliant researchers are working at the frontier labs, but often in parallel — repeating similar work. From a human capital standpoint, that’s not efficient.
- On AI’s societal importance: If we believe AI is critical for society, then we must ensure our collective talent is applied in the most effective, responsible way.
- On collaboration and openness: For researchers to truly collaborate, they need shared artifacts. That means not just open weights, but open data, open algorithms, open evaluations — a full, 360-degree open source approach.

It was an inspiring reminder that the way we structure openness in AI will deeply influence how fast — and how responsibly — we advance the field.
117
-
Shirly Ozer
Team8 • 2K followers
Solid (AI for Data) is out of stealth! I am excited to share the news about our portfolio company's public launch with $20M in seed funding! ✨

Solid is tackling the biggest bottleneck in Enterprise AI: the lack of context. They are building the AI enablement layer that moves enterprises’ AI from pilot to real-world production.

Huge congratulations to ProFounders Yoni Leitersdorf and Tal Segalov and the entire Solid team. I enjoyed being part of your early ideation and company-building journey! Now let's scale it big time 🚀 This is a generational company at the foundation of the enterprise AI stack, reflecting our belief that the AI era is defined by reliability and results. #SuccessbyDesign

Read more: https://lnkd.in/dX6JH6J5

Aviad Harell, Amir Zilberstein, Nathaniel Tavisal, Omri Sela, Noa Bar-Yosef, Robert Wiseman, Nick Aharoni, Noa Hen, Tomer Tirosh, Alon Melcer, Jonathan Bergerbest, Omer Biran, Tal Levi, Aviv Turecki, Liran Grinberg, Ori Barzilay, Ori Yankelev, Asaf Azulay
89
3 Comments
-
Grigory Sapunov
8K followers
An interesting new paper on LLM-JEPA from Hai Huang, Yann LeCun, and Randall Balestriero. 💡

Previously I wrote about the JEPA approach applied to videos (V-JEPA and V-JEPA 2) and time series (CHARM). Now the JEPA approach is finally applied to LLMs! This work bridges a major gap between AI for vision and language, offering a potential leap forward in how we train language models. Instead of just predicting the next word, LLM-JEPA teaches models to understand the underlying meaning by predicting abstract concepts (as the JEPA approach does)—for instance, grasping the essence of a code snippet from its natural language description.

The paper introduces a hybrid objective combining standard next-token prediction with a Joint Embedding Predictive Architecture (JEPA) loss, a technique highly successful in computer vision. The empirical results are compelling: LLM-JEPA consistently boosts performance, accelerates parameter-efficient fine-tuning (PEFT), and shows remarkable resistance to overfitting. This method doesn't just improve scores; it fundamentally creates more structured and transferable representations. While the current computational overhead is a challenge to address, this paper opens a promising new direction beyond traditional LLM training. 🚀

Review: https://lnkd.in/eC4Jte_r
Paper: https://lnkd.in/erZJadb3
Code: https://lnkd.in/ethXT7sX
75
3 Comments
-
Sajith Pai
Blume Ventures • 85K followers
This section on how dbt Labs transitioned from a purely PLG motion to layering on enterprise is a fascinating one. Two instructive passages that I have bolded (below, from First Round Review's path to PMF series featuring dbt Labs).

TLDR: GTM comprises your ICP, channel, and message. When you transition from a PLG / bottom-up to an Enterprise / top-down motion, naturally your ICP and channel change, but your messaging / proposition needs to change too, e.g., the enterprise may be multiple personas and is buying assurance as much as solution.

---

"Handy found product-market fit organically for dbt as an open-source tool mostly used by data practitioners and developers. But a few years into running a commercial business, he realized he had to build a growth curve all over again with C-suite data leaders.

“Even though we had an unbelievable amount of market pull, as we initially commercialized, it wasn’t easy for us to transform this open-source command line tool into a product that enterprises would pay a million dollars for,” says Handy. “When you have enough product-market fit, sometimes it allows you to get away with not being super tight on product marketing or sales motions. So around 2022, we went from this gigantic acceleration curve and overnight we realized, we have to sit at the adults’ table and figure things out real fast,” says Handy.

After the PLG flame started to fizzle, Handy turned his attention to layering on a sales-led motion for the cloud platform. “We had to focus our efforts on telling cohesive stories to senior data leaders. We had to have a very clear, explainable answer to the question, ‘Why should I use the commercial product and not the open-source product?’ And it had to be digestible by someone with a C in their title,” he says.

Handy’s answer: dbt Cloud can handle complex data for companies of every size. “The longer people used dbt, the more complex their code became,” he says.
“It was a problem for the most sophisticated dbt users, who were often at the largest companies. So there was a real opportunity for us to step in and solve that for them with dbt Cloud.” To tell that story to enterprise customers, Handy relied on data, naturally. “At a user conference we presented a chart that showed the number of dbt projects that had a certain number of models in them — over 100, over 1,000, et cetera,” he says. “We watched that number climb and we knew as users ourselves, ‘Oh my God, trying to work in a dbt project with 5,000 models in it is challenging.’ So we started with that quantitative data point and asked folks in our community about their experiences with these very large, complex dbt projects, and validated that this was a pain in the ass without a cloud platform.”
23
4 Comments
-
Adam French
Antler • 17K followers
🧑🏫 AI for product development: We've been doing it wrong. Two new research papers show exactly how to fix it - one from Wharton, one from PyMC Labs + Colgate-Palmolive.

PAPER 1: Ideation (Wharton)
When you ask GPT "generate 100 ideas," it produces ideas 50-75% MORE similar than human teams.
Solution: Chain-of-Thought prompting → Generate titles only → Force divergence ("make bolder, more different") → Then full descriptions
Result: 27% more unique ideas

PAPER 2: Consumer Testing (PyMC Labs + Colgate)
Traditional surveys: £10K, 2 weeks per concept. Direct LLM ratings: useless narrow distributions.
Solution: Semantic Similarity Rating (SSR) → Synthetic consumer personas → Natural text responses → Convert to ratings via embeddings
Result: 90% accuracy vs humans, £50 vs £10K. Tested on 9,300 real consumers across 57 products.

THE PATTERN: Both show: naive prompting fails, structured prompting transforms results.
❌ "Give me ideas" or "Rate this 1-5"
✅ Multi-step processes with explicit constraints

THE CAVEATS:
- CoT needs human filtering for feasibility
- SSR works for consumer products, not B2B/technical
- Always validate finalists with real humans
- Both require structured thinking and domain expertise

THE OPPORTUNITY: Most companies are still "trying ChatGPT for everything." Winners will systematise AI for specific high-value use cases. These papers provide validated, immediately implementable approaches.

Papers:
1. "Prompting Diverse Ideas" https://lnkd.in/egRS3u5Q
2. "LLMs Reproduce Human Purchase Intent" https://lnkd.in/eY7paNaC

For product leaders spending £50K+ on customer R&D: this deserves a pilot. Thoughts? Who's implementing this? 👇

#ProductDevelopment #AI #Innovation #MarketResearch
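The Semantic Similarity Rating idea (free-text persona response → nearest rating anchor in embedding space) can be sketched in a few lines. The bag-of-words "embedding" and anchor texts below are toy stand-ins; the paper uses real sentence embeddings.

```python
import math
from collections import Counter

# Hypothetical purchase-intent anchors, one per rating level.
ANCHORS = {
    1: "i would definitely not buy this product",
    3: "i might buy this product",
    5: "i would definitely buy this product",
}

def embed(text):
    # Toy bag-of-words vector; swap in a sentence-embedding model for real use.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def ssr_rating(response):
    # Map the free-text response to the most similar anchor's rating.
    emb = embed(response)
    return max(ANCHORS, key=lambda r: cosine(emb, embed(ANCHORS[r])))

print(ssr_rating("i would buy this product for sure"))  # 5
```

The point of routing through embeddings rather than asking the model "rate this 1-5" directly is that natural-language responses carry a distribution of opinion, while direct numeric ratings collapse into the narrow distributions the post describes.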
27
1 Comment
-
Sudarshan S.
Plaid • 1K followers
Excited to share that Plaid has launched AI-enhanced Transaction Categorization, a major step forward in delivering cleaner and more reliable financial data. Customers across the industry rely on accurate transaction insights to build intuitive, high-trust financial products, but inconsistent or coarse categorization can lead to confusing experiences and added manual work. We addressed this by developing a new AI-driven model and a more granular taxonomy that captures richer distinctions across income, loan payments, fees, transfers, and other key transaction types, enabling far more accurate and flexible classification. This advancement gives customers clearer views into financial behaviors, stronger logic to power product experiences, and better automation for downstream decisioning. It delivers up to 10 percent higher accuracy for primary categories and up to 20 percent for detailed subcategories, with a rollout that seamlessly supports both existing and new customers. https://lnkd.in/gurcK7-i
51
-
Eitan Anzenberg, PhD
Eightfold • 3K followers
Our team just posted our latest paper “Evaluating the Promise & Pitfalls of LLMs in Hiring Decisions” on arXiv! We found some exciting results:
• Benchmarked leading LLMs (GPT-4o, o3, Claude, Gemini, Llama, DeepSeek) against Eightfold’s “Match Score” model on real-world data.
• Evaluated both performance (ROC AUC, PR AUC, F1) and fairness (impact-ratio across gender, race, intersectional groups).
• Eightfold’s Match Score beat the best LLM on accuracy (ROC AUC 0.85 vs 0.77) and fairness (min race Impact Ratio 0.957 vs 0.809).
• Off-the-shelf LLMs still propagate measurable demographic bias without safeguards.
• The trade-off between accuracy and fairness is a false dichotomy: carefully engineered, domain-tuned models like Eightfold’s can achieve both accuracy of hiring and fairness of outcomes.

https://lnkd.in/guQ2TAYp

#machinelearning #ai #eightfold #arxiv #datascience #bias #fairness #ml #data #genai #llms
38
2 Comments
-
Yossi Matias
Google • 54K followers
We have released TimesFM-2.5, the new leader in GIFT-Eval on all accuracy metrics among zero-shot foundation models. This improved foundation model is now available on Hugging Face, with an upcoming release on GCP's BigQuery and Model Garden.

This release represents a significant advancement. TimesFM-2.5 outperforms TimesFM 2.0 on leading benchmarks by up to 25%, while using half the number of parameters (200M). It also features a longer maximum context length of 16K.

What is Time-Series Forecasting?
Time-series analysis is the process of studying data points collected over time to make predictions and identify trends. It is a critical tool for a wide range of applications, like:
➡️ Forecasting future product demand.
➡️ Tracking weather and precipitation.
➡️ Optimizing supply chains and energy grids.

Why is it Challenging?
Time-series forecasting is difficult because data patterns are often complex, can change over time, and are influenced by numerous factors. Developing a single model that can perform well across diverse datasets without being explicitly trained on each one has been a major challenge.

TimesFM-2.5: A New Standard for Zero-Shot Forecasting
TimesFM-2.5 sets a new standard for a decoder-only foundation model trained on a large time-series corpus.
🤗 Leading Performance: TimesFM-2.5 holds the top position on the GIFT-Eval leaderboard for point forecasting accuracy (MASE) and probabilistic forecasting accuracy (CRPS).
🤗 Efficiency: The model's small parameter count makes it practical for a wide range of production environments.
🤗 Longer Context: The increased context length allows the model to process more historical data, leading to more accurate forecasts.

This work reiterates that building a single foundation model for time-series forecasting is possible. Google Research keeps pushing forward the frontier of time-series forecasting research.

We are grateful to our community and customers who have provided feedback and deployed TimesFM in production. We are interested to hear more about how you are using TimesFM. Read more in our repository and see the leaderboard.

GIFT-Eval: https://lnkd.in/dAwAcKA7
GitHub: https://lnkd.in/dWXH7BAm
Hugging Face: https://lnkd.in/dtx9iMHE
Paper: https://lnkd.in/dRw3zzXT
282
9 Comments
-
Michael Brenndoerfer
EQT Group • 5K followers
One of the surprising realities of working with large language models (LLM) is that you can feed in the same prompt and still get different outputs, even when you set temperature = 0. LLMs are fundamentally probabilistic, and while lowering the temperature reduces randomness, it doesn’t guarantee determinism. With greedy decoding, subtle factors like floating-point precision, parallel computation, tie-breaking logic, and model architecture quirks (e.g. Mixture-of-Experts) can cause outputs to diverge across runs. The result: temperature = 0 increases consistency, but doesn’t ensure it. For applications where reproducibility matters, it’s important to understand where this variability comes from and how to mitigate it. If you’ve ever been puzzled by the output differences and want to know why, give it a read and let me know what you think. https://lnkd.in/enpsRH_V
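The floating-point side of this is easy to demonstrate. The snippet below is a toy illustration, not how an inference engine computes logits: the token names and logit values are hypothetical, but the non-associativity it shows is exactly the mechanism by which a different parallel reduction order can flip a near-tie under greedy decoding.

```python
# Floating-point addition is not associative: the same mathematical sum,
# accumulated in a different order, can yield different bits.
s1 = (0.1 + 0.2) + 0.3   # one reduction order -> 0.6000000000000001
s2 = 0.1 + (0.2 + 0.3)   # another reduction order -> 0.6
print(s1 == s2)          # False

# Toy greedy step: token B's logit falls between the two computed values
# for token A, so the "same" logits pick different tokens depending on
# which reduction order the kernel happened to use (ties go to the
# first index, mimicking argmax tie-breaking).
token_b_logit = 0.6000000000000001  # hypothetical competitor, equals s1
tokens = ["A", "B"]
pick_order_1 = tokens[[s1, token_b_logit].index(max(s1, token_b_logit))]
pick_order_2 = tokens[[s2, token_b_logit].index(max(s2, token_b_logit))]
print(pick_order_1, pick_order_2)  # A B
```

On real hardware the reduction order varies with batch size, kernel choice, and parallelism, which is why temperature = 0 narrows the output distribution without pinning it.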
39
3 Comments
-
Apoorv Agrawal
Altimeter • 19K followers
Another gem from Ali Ghodsi and Arvind Jain.

Q: Where will value accrue in the AI stack? Data vs LLMs vs Apps.

Ali:
1) Proprietary data is very valuable.
2) Intelligence is a commodity, interchangeable.
3) Apps are where most value will accrue, like Glean.
87
6 Comments
-
Greg Kapoustin
If you’re an allocator or… • 1K followers
That even SIP (https://lnkd.in/g8sdJNZ2) data exhibits lookahead bias (time-travel) illustrates why self-reported timestamps for #AlternativeData, signals, and #quant strategies, which often involve material processing and delivery lags, are seldom credible.
1
-
Yochai Korn
Leelafy • 7K followers
Quick question for folks running AI-heavy production systems: Are you doing cascade routing for LLM inference? (Start with cheap models, escalate only when needed.)

Context: In last week's investor reports, one portfolio company showed 80% reduction in monthly LLM spend. Asked them how—they're using #cascadeflow (OSS, MIT): https://lnkd.in/eXRE568n

Their numbers:
→ Before: $127K/mo
→ After: $31K/mo
→ Setup: <1 hour
→ Quality: No measurable degradation

What's interesting: it cascades through models during query generation—so you don't have to decide upfront for unpredictable queries or tool calls. Routes dynamically based on actual request complexity.

I'm curious about real-world experiences:

Savings vs tradeoffs:
• Actual cost reduction % you've seen?
• Latency impact from routing overhead?
• Quality regression rate?
• How are you measuring quality maintenance?

Implementation:
• Eval methodology for comparing routed vs non-routed responses?
• Monitoring approach for catching routing failures?
• Fallback strategy when routing breaks?

Edge cases:
• Multi-turn conversations (context switching between models)?
• Streaming responses?
• Rate limits across providers?
• Security/compliance (data going to multiple providers)?

For context: managing 28 portfolio companies, roughly half are AI-heavy. If this holds up, it's material EBITDA improvement across the portfolio. But I want to understand gotchas before recommending broadly.

Anyone running cascade routing in production? Would love to compare notes in the comments or DMs.

#AI #LLM #ProductionAI #CostOptimization #StartupMetrics #LLMRouting #cascadeflow Lemony
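The cheap-first, escalate-on-doubt pattern the post asks about can be sketched in a dozen lines. Everything here is a hypothetical stand-in (the stub `call_model`, the confidence scores, the threshold); this is the general technique, not the cascadeflow API.

```python
def call_model(model, prompt):
    # Stub for a real provider call returning (answer, confidence).
    scores = {"small": 0.62, "medium": 0.78, "large": 0.97}
    return f"{model}: answer to {prompt!r}", scores[model]

def cascade(prompt, tiers=("small", "medium", "large"), threshold=0.9):
    # Try cheap models first; escalate only while the answer looks
    # low-confidence. The last tier is the unconditional fallback.
    for model in tiers:
        answer, confidence = call_model(model, prompt)
        if confidence >= threshold:
            break
    return model, answer

model, _ = cascade("Summarize this contract.")
print(model)  # large

model, _ = cascade("What year is it?", threshold=0.75)
print(model)  # medium
```

The savings come from the fact that most traffic stops at a cheap tier; the open questions in the post (quality regression, latency overhead, multi-turn context) all live in how you implement the confidence check and what state you carry across an escalation.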
44
16 Comments
-
Justin Patel
Decasonic • 1K followers
AI is compressing the traditional VC edge (access + information). At Decasonic, we’re building an AI Operating System to compound what matters: learning velocity + conviction, using integrated context, memory, and reinforcement learning so insights persist instead of resetting deal-to-deal. The full framework + roadmap is in the comments. The image below is Decasonic’s AI Technical Stack and Applications: it illustrates how context, model, and memory interact through reinforcement learning.
12
1 Comment