Journey from Entrepreneur to Employee
6 by vortex_ape | 0 comments on Hacker News.
Tuesday, December 31, 2024
Monday, December 30, 2024
Sunday, December 29, 2024
Saturday, December 28, 2024
Friday, December 27, 2024
Thursday, December 26, 2024
New top story on Hacker News: Gondwanaland: The search for a land before (human) time
Gondwanaland: The search for a land before (human) time
7 by bryanrasmussen | 1 comments on Hacker News.
Wednesday, December 25, 2024
New top story on Hacker News: The Swedish cabin on the frontline of a possible hybrid war
The Swedish cabin on the frontline of a possible hybrid war
19 by Sami_Lehtinen | 2 comments on Hacker News.
Tuesday, December 24, 2024
Monday, December 23, 2024
New top story on Hacker News: Nordstrom Family to Take Company Private in $6.25B Deal
Nordstrom Family to Take Company Private in $6.25B Deal
25 by ivewonyoung | 12 comments on Hacker News.
Sunday, December 22, 2024
Saturday, December 21, 2024
Friday, December 20, 2024
New top story on Hacker News: Ask HN: How do you work with people who are "not quite smart"?
Ask HN: How do you work with people who are "not quite smart"?
3 by charles_f | 4 comments on Hacker News.
Hey, this is a touchy subject, and it might reflect a lack of awareness or empathy on my part. But trust that it comes from a genuine willingness to make things better for everyone. We all work with people we find "not as good", who have different ways of working or different work ethics. After being told for decades that this is usually a problem of communication or point of view, I had somewhat internalized the idea. It is often true, but what I've realized lately is that there's a category of people who aren't just working differently but are - to put it bluntly - plainly not smart.

What I'm talking about is people below average at understanding concepts, or at conceptualizing altogether. Their intuition is consistently twisted and wrong. They completely lack critical feedback. Work needs to be decomposed for them into extremely precise steps if you want anything to happen. The type of person where you know anything assigned to them will be badly done. When you open a document or code they've written, you do it anticipating that it's going to be bad in novel ways. And despite all your efforts to coach them, they seem to make no progress (where the same coaching works on others).

I know there might be other causes for this: something happening in their life, lack of interest in the task, or low motivation overall. But I think I can draw a clear distinction between someone who doesn't give a crap and someone who does but is not equipped to achieve the task at hand. Some of them are direct colleagues whom I can provide feedback about - but then what do you tell them or their manager? "It's very hard to explain concepts to you; you should work on conceptualization"? Some of them are in other teams, and I don't even know what to do about those people.

Take an "experienced" PM who systematically gives you hot garbage: apart from pointing out the lack of research and the absence of evidence-based reasoning, how do you tell someone "what you gave me is plain dumb, and that's not just a difference of opinion about how we should approach this product/feature; it's inconsistent and pure shit"? I'm not in a management position, so I can't "just fire them". Nor do I want to tell them "you're just not equipped for your job; you should find another". It's quite disheartening. Does anyone have techniques they've used? Am I just missing something?
Thursday, December 19, 2024
Wednesday, December 18, 2024
Tuesday, December 17, 2024
Monday, December 16, 2024
New top story on Hacker News: Show HN: NCompass Technologies – yet another AI Inference API, but hear us out
Show HN: NCompass Technologies – yet another AI Inference API, but hear us out
3 by adiraja | 5 comments on Hacker News.
Hello HackerNews! I’m excited to share what we’ve been working on at nCompass Technologies: an AI inference platform that gives you a scalable and reliable API to access any open-source AI model — with no rate limits. We can drop rate limits because the optimizations we made to our model-serving software let us support a high number of concurrent requests without degrading quality of service for you as a user.

If you’re thinking "well, aren’t there a bunch of these already?", so were we when we started nCompass. When using other APIs, we found that they weren’t reliable enough to use open-source models in production environments. To resolve this, we're building an AI inference engine that enables you, as an end user, to reliably use open-source models in production. Underlying this API, we’re building optimizations at the hosting, scheduling, and kernel levels with a single goal: minimizing the number of GPUs required to maximize the number of concurrent requests you can serve, without degrading quality of service. We’re still building a lot of our optimizations, but we’ve released what we have so far via our API. Compared to vLLM at an equivalent concurrent request rate, we currently keep time-to-first-token (TTFT) 2-4x lower. You can check out a demo of our API here: https://ift.tt/WeBS91Q

As a result of the optimizations we’ve rolled out so far, we’re releasing a few unique features on our API:

1. Rate limits: we don’t have any. Most other APIs out there have strict rate limits and can be rather unreliable. We don’t want APIs for open-source models to remain a solution for prototypes only. We want people to use these APIs the way they use OpenAI’s or Anthropic’s, and actually make production-grade products on top of open-source models.

2. Underserved models: we have them. There are a ton of models out there, but not all of them are readily available to people who don’t have access to GPUs.

We envision our API becoming a system where anyone can launch any custom model of their choice with minimal cold starts and run the model as a simple API call. Our cold starts for any 8B or 70B model are only 40s, and we’ll keep improving this. Toward this goal, we already host models like `ai4bharat/hercule-hi` to support non-English language use cases and models like `Qwen/QwQ-32B-Preview` to support reasoning-based use cases. You can find the other models that we host here: https://ift.tt/pWokNiS.

We’d love for you to try out our API by following the steps here: https://ift.tt/0g8aAoV . We provide $100 of free credit on sign-up to run models, and like we said, go crazy with your requests — we’d love to see if you can break our system :) We’re still actively building out features and optimizations, and your input can help shape the future of nCompass. If you have thoughts on our platform or want us to host a specific model, let us know at hello@ncompass.tech. Happy hacking!
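Running a hosted open-source model "as a simple API call" usually means an OpenAI-style chat-completions request. The sketch below only builds such a request; the endpoint URL and payload shape are assumptions for illustration, not nCompass's documented API — consult their docs for the real interface.

```python
import json
import urllib.request

# Hypothetical endpoint: an OpenAI-compatible chat-completions shape is
# assumed here, since that is the common convention for hosted models.
API_URL = "https://api.example.com/v1/chat/completions"

def build_request(model: str, prompt: str, api_key: str) -> urllib.request.Request:
    """Assemble (but do not send) a chat-completion request for `model`."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

req = build_request("Qwen/QwQ-32B-Preview", "Hello!", api_key="sk-...")
```

Sending it would be a single `urllib.request.urlopen(req)` call; the model identifier is the only thing that changes between hosted models.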
Sunday, December 15, 2024
Saturday, December 14, 2024
Friday, December 13, 2024
New top story on Hacker News: Show HN: I made the slowest, most expensive GPT
Show HN: I made the slowest, most expensive GPT
23 by wluk | 13 comments on Hacker News.
This is another one of my automate-my-life projects: I'm constantly asking the same question to different AIs, since there's always the hope of getting a better answer somewhere else. Maybe ChatGPT's answer is too short, so I ask Perplexity. But I realize that's hallucinated, so I try Gemini. That answer sounds right, but I cross-reference with Claude just to make sure.

This doesn't really apply to math/coding (where o1 or Gemini can probably one-shot an excellent response), but more to online search, where information is more fluid and there's no single "right" search engine + text restructuring + model combination every time. Even o1 doesn't have online search, so it's obviously a hard problem to solve. An example is something like "best ski resorts in the US", which will get a different response from every GPT, but most of their rankings won't reflect actual skiers' consensus - say, on Reddit https://ift.tt/jm8DBXF... - because there are so many opinions floating around that a one-shot RAG search + LLM isn't going to have enough context to find how everyone thinks. And obviously, offline GPTs like o1 and Sonnet/Haiku aren't going to have the latest updates if a resort closes, for example.

So I’ve spent the last few months experimenting with a new project that's basically the most expensive GPT I’ll ever run. It runs search queries through ChatGPT, Claude, Grok, Perplexity, Gemini, etc., then aggregates the responses. For added financial tragedy, in between it also uses multiple embedding models and performs iterative RAG searches through different search engines. This all functions as one giant AI brain. So I pay for every search, then every embedding, then every intermediary LLM input/output, then the final LLM input/output. On average it costs about 10 to 30 cents per search. It's also extremely slow. https://ithy.com

I know that sounds absurdly overkill, but that’s kind of the point. The goal is to get the most accurate and comprehensive answer possible, because it's been vetted by a bunch of different AIs, each sourcing from different buckets of websites. Context limits today are just large enough that this type of search and cross-model iteration is possible, where we can determine the "overlap" between a diverse set of texts to reach some sort of consensus. The idea is to get online answers that aren't attainable from any single AI. If you end up trying this out, I'd recommend comparing Ithy's output against the other GPTs to see the difference.

It's going to cost me a fortune to run this project (I'll probably keep it online for a month or two), but I see it as an exploration of what’s possible with today’s model APIs rather than something that’s immediately practical. Think of it as an online o1 (without the $200/month price tag, though I'm offering a $29/month Pro plan to help subsidize). If nothing else, it’s a fun (and pricey) thought experiment.
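The fan-out step described above — asking every model the same question in parallel before aggregating — can be sketched generically. The stub provider functions here are placeholders standing in for real API clients (ChatGPT, Claude, etc.), not Ithy's actual code.

```python
from concurrent.futures import ThreadPoolExecutor

def fan_out(question: str, providers: dict) -> dict:
    """Send one question to every provider concurrently and collect the
    answers by provider name; an aggregation model would then look for
    overlap among them to form a consensus."""
    with ThreadPoolExecutor() as pool:
        futures = {name: pool.submit(ask, question)
                   for name, ask in providers.items()}
        return {name: f.result() for name, f in futures.items()}

# Stub providers: in a real system each would be an HTTP call to a model API.
providers = {
    "model_a": lambda q: f"A's answer to: {q}",
    "model_b": lambda q: f"B's answer to: {q}",
}
answers = fan_out("best ski resorts in the US", providers)
```

Because each provider call is I/O-bound, a thread pool keeps total latency near that of the slowest single model rather than the sum of all of them.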
Thursday, December 12, 2024
New top story on Hacker News: Show HN: Gentrace – connect to your LLM app code and run/eval it from a UI
Show HN: Gentrace – connect to your LLM app code and run/eval it from a UI
7 by dsaffy | 0 comments on Hacker News.
Hey HN - Doug from Gentrace here. We originally launched via Show HN in August of 2023 as evaluation and observability for generative AI: https://ift.tt/wuPqS7Q

Since then, everyone from the model providers to LLM ops companies has built a prompt playground. We had one too, until we realized this was totally the wrong approach:

- It's not connected to your application code
- They don't support all models
- You have to rebuild evals for just this one prompt (you can't use your end-to-end evals)

In other words, it took a ton of work and time to use these to actually make your app better. So we built a new experience and are relaunching around this idea: Gentrace is a collaborative LLM app testing and experimentation platform that brings together engineers, PMs, subject-matter experts, and more to run and test your actual end-to-end app. To do this, use our SDK to:

- connect your app to Gentrace as a live runner over websocket (local) / via webhook (staging, prod)
- wrap your parameters (e.g. prompt, model, top-k) so they become tunable knobs in the front end
- edit the parameters, then run and evaluate the actual app code with datasets and evals in Gentrace

We think it's great for tuning retrieval systems, upgrading models, and iterating on prompts. It's free to trial. We'd love to hear your feedback / what you think!
Wednesday, December 11, 2024
Tuesday, December 10, 2024
New top story on Hacker News: Ask HN: Those making $500/month on side projects in 2024 – Show and tell
Ask HN: Those making $500/month on side projects in 2024 – Show and tell
87 by cvbox | 72 comments on Hacker News.
It's that time of the year again, so I'd be interested to hear what new (and old) ideas have come up. Previously asked in: 2023 → https://ift.tt/AVPiuM7 2022 → https://ift.tt/EG1Wk09 2021 → https://ift.tt/MR4ym5Z 2020 → https://ift.tt/9hLDFBO 2019 → https://ift.tt/u0mQAlr 2018 → https://ift.tt/wIWJPAH 2017 → https://ift.tt/r1VXNdu
Monday, December 9, 2024
Sunday, December 8, 2024
Saturday, December 7, 2024
Friday, December 6, 2024
Thursday, December 5, 2024
Wednesday, December 4, 2024
New top story on Hacker News: Show HN: I combined spaced repetition with emails so you can remember anything
Show HN: I combined spaced repetition with emails so you can remember anything
15 by iskrataa | 3 comments on Hacker News.
Hey HN, I am a student shipping apps in my free time. This is my 4th for the year! Non-fiction books and podcasts have been part of my life for years now, but I always struggled with remembering what I’ve read or listened to. I wanted it to stick even after years. My notes list grew large, but I never really revisited them. That’s why I created GinkgoNotes. You enter notes you want to recall and leave it to the app to create a personalised email schedule based on spaced repetition. That means your notes get emailed to you a few times, exactly when you should read them again (based on Ebbinghaus's forgetting curve), so that they actually stick. I hope this will be as helpful for you as it was for me. Would love some feedback! Iskren
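The scheduling idea behind the forgetting curve is expanding review intervals: each successful review pushes the next one further out. A minimal sketch, with an illustrative growth factor (not GinkgoNotes' actual schedule):

```python
from datetime import date, timedelta

def review_schedule(start: date, reviews: int = 5,
                    base_days: float = 1.0, factor: float = 2.5) -> list:
    """Return review dates whose gaps grow by `factor` each time,
    mirroring how the forgetting curve flattens after every repetition."""
    dates, interval = [], base_days
    for _ in range(reviews):
        start = start + timedelta(days=round(interval))
        dates.append(start)
        interval *= factor
    return dates

# Each date is when a reminder email for the note would be sent.
schedule = review_schedule(date(2024, 1, 1))
```

A scheduler then only has to compare each note's next due date against today to decide which emails to send.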
Tuesday, December 3, 2024
Monday, December 2, 2024
Sunday, December 1, 2024
New top story on Hacker News: Show HN: Vicinity – Fast, Lightweight Nearest Neighbors with Flexible Back Ends
Show HN: Vicinity – Fast, Lightweight Nearest Neighbors with Flexible Back Ends
12 by Pringled | 0 comments on Hacker News.
We’ve just open-sourced Vicinity, a lightweight approximate nearest neighbors (ANN) search package that allows for fast experimentation with, and comparison of, a large number of well-known algorithms.

Main features:

- Lightweight: the base package only uses NumPy
- Unified interface: use any of the supported algorithms and backends through a single interface: HNSW, Annoy, FAISS, and many more algorithms and libraries are supported
- Easy evaluation: evaluate the performance of your backend with a simple function that measures queries per second vs. recall
- Serialization: save and load your index for persistence

After working with a large number of ANN libraries over the years, we found it increasingly cumbersome to learn the interface, features, quirks, and limitations of every library. After writing custom evaluation code to measure speed and performance for the 100th time while comparing libraries, we decided to build this as a way to use a large number of algorithms and libraries through a unified, simple interface that allows for quick comparison and evaluation. We are curious to hear your feedback! Are there any algorithms you use that are missing? Any extra evaluation metrics that would be useful?
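The "queries per second vs. recall" evaluation mentioned above boils down to comparing a backend's returned neighbors against exact brute-force neighbors. A minimal recall@k sketch (a generic illustration of the metric, not Vicinity's own code):

```python
import numpy as np

def recall_at_k(true_idx: np.ndarray, approx_idx: np.ndarray, k: int) -> float:
    """Fraction of the exact top-k neighbors that the approximate index
    also returned, averaged over all queries."""
    hits = [len(set(t[:k]) & set(a[:k])) / k
            for t, a in zip(true_idx, approx_idx)]
    return float(np.mean(hits))

rng = np.random.default_rng(0)
data = rng.normal(size=(1000, 32))      # indexed vectors
queries = rng.normal(size=(10, 32))     # query vectors

# Exact neighbors by brute-force squared distance: the ground truth any
# ANN backend is scored against.
dists = ((queries[:, None, :] - data[None, :, :]) ** 2).sum(-1)
exact = np.argsort(dists, axis=1)
```

Timing the same query batch against each backend gives the queries-per-second axis; plotting it against recall@k is the standard speed/accuracy trade-off curve for ANN libraries.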
Saturday, November 30, 2024
Friday, November 29, 2024
Thursday, November 28, 2024
Wednesday, November 27, 2024
Tuesday, November 26, 2024
Monday, November 25, 2024
New top story on Hacker News: Show HN: Minimal, customizable new tab for Chrome/Firefox
Show HN: Minimal, customizable new tab for Chrome/Firefox
13 by georg-stone | 6 comments on Hacker News.
Hello HN! Flowtide is a project I have been working on for about 2 months now. It is a customizable new-tab page for Firefox or Chrome. By default it is configured with a minimal set of features, but it can be extended with a clock, a to-do list, or even soundscapes. Install: https://flowtide.app/ GitHub: https://ift.tt/QJx0X7K
Sunday, November 24, 2024
Saturday, November 23, 2024
Friday, November 22, 2024
Thursday, November 21, 2024
Wednesday, November 20, 2024
Urban voters disappoint in Maharashtra: Mumbai, Pune and Thane saw the lowest turnout. See the figures
Maharashtra assembly elections 2024
New Delhi: Voting in the single-phase Maharashtra assembly elections took place today. The figures released after polling closed at 6 pm are not very encouraging. According to the Election Commission, voter participation was low in cities such as Mumbai, Pune and Thane. By 5 pm, turnout stood at 58.22 percent in Maharashtra and 67.59 percent in Jharkhand. For comparison, the same Jharkhand assembly seats recorded 67.04 percent turnout in 2019.
Tuesday, November 19, 2024
Monday, November 18, 2024
New top story on Hacker News: Show HN: FastGraphRAG – Better RAG using good old PageRank
Show HN: FastGraphRAG – Better RAG using good old PageRank
22 by liukidar | 5 comments on Hacker News.
Hey there HN! We’re Antonio, Luca, and Yuhang, and we’re excited to introduce Fast GraphRAG, an open-source RAG approach that leverages knowledge graphs and the 25 years old PageRank for better information retrieval and reasoning. Building a good RAG pipeline these days takes a lot of manual optimizations. Most engineers intuitively start from naive RAG: throw everything in a vector database and hope that semantic search is powerful enough. This can work for use cases where accuracy isn’t too important and hallucinations are tolerable, but it doesn’t work for more difficult queries that involve multi-hop reasoning or more advanced domain understanding. Also, it’s impossible to debug it. To address these limitations, many engineers find themselves adding extra layers like agent-based preprocessing, custom embeddings, reranking mechanisms, and hybrid search strategies. Much like the early days of machine learning when we manually crafted feature vectors to squeeze out marginal gains, building an effective RAG system often becomes an exercise in crafting engineering “hacks.” Earlier this year, Microsoft seeded the idea of using Knowledge Graphs for RAG and published GraphRAG - i.e. RAG with Knowledge Graphs. We believe that there is an incredible potential in this idea, but existing implementations are naive in the way they create and explore the graph. That’s why we developed Fast GraphRAG with a new algorithmic approach using good old PageRank. There are two main challenges when building a reliable RAG system: (1) Data Noise: Real-world data is often messy. Customer support tickets, chat logs, and other conversational data can include a lot of irrelevant information. If you push noisy data into a vector database, you’re likely to get noisy results. (2) Domain Specialization: For complex use cases, a RAG system must understand the domain-specific context. 
This requires creating representations that capture not just the words but the deeper relationships and structures within the data.

Our solution builds on these insights by incorporating knowledge graphs into the RAG pipeline. Knowledge graphs store entities and their relationships, and can help structure data in a way that enables more accurate and context-aware information retrieval. Twelve years ago Google announced the Knowledge Graph we all know about [1]. It was a pioneering move. Now that we have LLMs, people can finally do RAG on their own data with tools that can be as powerful as Google’s original idea.

Before we built this, Antonio was at Amazon, while Luca and Yuhang were finishing their PhDs at Oxford. We had been thinking about this problem for years, and we always loved the parallel between PageRank and human memory [2]. We believe that searching for memories is remarkably similar to searching the web.

Here’s how it works:

- Entity and relationship extraction: Fast GraphRAG uses LLMs to extract entities and their relationships from your data and stores them in a graph format [3].

- Query processing: When you make a query, Fast GraphRAG starts by finding the most relevant entities using vector search, then runs a personalized PageRank algorithm to determine the most important “memories,” or pieces of information, related to the query [4].

- Incremental updates: Unlike other graph-based RAG systems, Fast GraphRAG natively supports incremental data insertions, so you can continuously add new data without reprocessing the entire graph.

- Speed: These design choices make our algorithm faster and more affordable to run than other graph-based RAG systems, because we eliminate the need for communities and clustering.

Suppose you’re analyzing a book and want to focus on character interactions, locations, and significant events:

    from fast_graphrag import GraphRAG

    DOMAIN = "Analyze this story and identify the characters. Focus on how they interact with each other, the locations they explore, and their relationships."

    EXAMPLE_QUERIES = [
        "What is the significance of Christmas Eve in A Christmas Carol?",
        "How does the setting of Victorian London contribute to the story's themes?",
        "Describe the chain of events that leads to Scrooge's transformation.",
        "How does Dickens use the different spirits (Past, Present, and Future) to guide Scrooge?",
        "Why does Dickens choose to divide the story into \"staves\" rather than chapters?"
    ]

    ENTITY_TYPES = ["Character", "Animal", "Place", "Object", "Activity", "Event"]

    grag = GraphRAG(
        working_dir="./book_example",
        domain=DOMAIN,
        example_queries="\n".join(EXAMPLE_QUERIES),
        entity_types=ENTITY_TYPES
    )

    with open("./book.txt") as f:
        grag.insert(f.read())

    print(grag.query("Who is Scrooge?").response)

This code creates a domain-specific knowledge graph based on your data, example queries, and specified entity types. You can then query it in plain English while it automatically handles all the data fetching, entity extraction, co-reference resolution, memory election, and so on. When you add new data, locking and checkpointing are handled for you as well. This is the kind of infrastructure that GenAI apps need to handle large-scale, real-world data. Our goal is to give you this infrastructure so that you can focus on what’s important: building great apps for your users without having to manually engineer a retrieval pipeline. In the managed service, we also have a suite of UI tools for you to explore and debug your knowledge graph.

We have a free hosted solution with up to 100 monthly requests. When you’re ready to grow, we have paid plans that scale with you. And of course you can self-host our open-source engine. Give us a spin today at https://circlemind.co and see our code at https://ift.tt/lXzjWo8 We’d love feedback :)

[1] https://ift.tt/Ow8FjoM...

[2] Griffiths, T. L., Steyvers, M., & Firl, A. (2007). Google and the Mind: Predicting Fluency with PageRank. Psychological Science, 18(12), 1069–1076. https://ift.tt/OZ0R9fb

[3] Similarly to Microsoft’s GraphRAG: https://ift.tt/W6YFs4a

[4] Similarly to OSU’s HippoRAG: https://ift.tt/numkr9D

https://ift.tt/a0C84ek
22 by liukidar | 5 comments on Hacker News.
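The personalized-PageRank retrieval step described above can be sketched in a few lines of plain Python. This is an illustrative toy, not Fast GraphRAG's actual implementation: the graph, the entity names, and the `personalized_pagerank` helper are all invented for the example. The key idea is that the teleport distribution is concentrated on the entities matched by vector search, so the random walk ranks the rest of the graph relative to the query.

```python
# Toy personalized PageRank via power iteration (no third-party deps).
def personalized_pagerank(edges, seeds, damping=0.85, iters=50):
    nodes = {n for e in edges for n in e}
    out = {n: [] for n in nodes}
    for src, dst in edges:
        out[src].append(dst)
    # Teleport distribution concentrated on the query's seed entities.
    p = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    rank = dict(p)
    for _ in range(iters):
        nxt = {n: (1 - damping) * p[n] for n in nodes}
        for n, r in rank.items():
            if out[n]:
                share = damping * r / len(out[n])
                for dst in out[n]:
                    nxt[dst] += share
            else:
                # Dangling node: return its mass to the teleport distribution.
                for m in nodes:
                    nxt[m] += damping * r * p[m]
        rank = nxt
    return rank

# Toy knowledge graph: nodes are extracted entities, edges are relationships.
edges = [
    ("Scrooge", "Marley"),
    ("Scrooge", "Bob Cratchit"),
    ("Bob Cratchit", "Tiny Tim"),
    ("Ghost of Christmas Past", "Scrooge"),
]
scores = personalized_pagerank(edges, seeds={"Scrooge"})
print(max(scores, key=scores.get))  # the seed entity dominates: Scrooge
```

Entities close to the seed (here, Scrooge's direct relations) score higher than distant or unconnected ones, which is the "most important memories for this query" behavior the post describes.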
Sunday, November 17, 2024
Saturday, November 16, 2024
Friday, November 15, 2024
Thursday, November 14, 2024
Wednesday, November 13, 2024
New top story on Hacker News: Show HN: Konga Beat – A custom track editor for Donkey Konga 2 and 3
Show HN: Konga Beat – A custom track editor for Donkey Konga 2 and 3
31 by CIARobotFish | 7 comments on Hacker News.
Howdy HN! For those who don't know, back in the early 2000s, Nintendo and Namco developed a series of music rhythm games for the GameCube featuring Donkey Kong called Donkey Konga: https://ift.tt/RhCuPST The Donkey Konga games borrowed heavily from Taiko no Tatsujin (another music rhythm game by Namco). However, instead of taiko drums, the player would use DK Bongos to jam along with music from different eras and genres. Long story short, I figured out how to add custom tracks to some of the Donkey Konga games (Donkey Konga 2 and 3) but found the entire process cumbersome, so I decided to make a dedicated editor. It was a lot of fun to make, and I hope others get some enjoyment out of it too!
Tuesday, November 12, 2024
New top story on Hacker News: Large Language Models in National Security Applications
Large Language Models in National Security Applications
34 by bindidwodtj | 9 comments on Hacker News.
Monday, November 11, 2024
Sunday, November 10, 2024
Saturday, November 9, 2024
Friday, November 8, 2024
New top story on Hacker News: Pirating "The Pirate Bay" TV Series Is Ironically Difficult
Pirating "The Pirate Bay" TV Series Is Ironically Difficult
20 by HieronymusBosch | 5 comments on Hacker News.
Thursday, November 7, 2024
Wednesday, November 6, 2024
New top story on Hacker News: Launch HN: Midship (YC S24) – Turn unstructured documents into usable data
Launch HN: Midship (YC S24) – Turn unstructured documents into usable data
6 by maxmaio | 1 comments on Hacker News.
Hey HN, we are Max, Kieran, and Aahel from Midship ( https://midship.ai ). Midship makes it easy to extract data from unstructured documents like PDFs and images. Here’s a video showing it in action: https://ift.tt/W4wFRue?... , and a demo playground (no signup required!) to test it out: https://ift.tt/QRsAd1b

We started 5 months ago initially trying to make an AI natural-language workflow builder that would be a simpler alternative to Zapier or Make.com. However, most of our users seemed much more interested in the basic (and not very good) document-extraction feature we had. Seeing how people were spending hours a day manually extracting data from PDFs inspired us to build what has become Midship!

The problem is that despite all our progress in software, huge amounts of business data still live in PDFs and images. Sure, you can OCR them, but getting clean, structured data out is still painful. Most existing tools just give you a blob of markdown, leaving you to figure out which parts matter and how they relate. We've found that combining OCR with language models lets us do something more useful: extract the specific fields and tables that users actually care about. The LLMs help correct OCR mistakes and understand context (like knowing that "Inv#" and "Invoice Number" mean the same thing).

We have two main kinds of users today: non-technical users who extract data via our web app, and developers who use our extraction API. We were initially focused on the first group, as they seemed like an underserved part of the market, but we’ve received a lot of interest from developers who face the same issues. For pricing, we currently charge a monthly SaaS fee per seat for the web app and volume-based pricing for the API. We’re really excited to share what we’ve built so far and look forward to any feedback from the community!
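The "Inv#" vs. "Invoice Number" problem mentioned above is essentially header normalization: mapping the many ways a field is labeled in the wild onto one canonical schema key. As a hedged illustration only (not Midship's actual code; `CANONICAL_FIELDS` and `normalize_header` are invented for this sketch, and the real product uses an LLM rather than a static lookup table):

```python
# Hypothetical header-normalization table. A real pipeline would let an LLM
# judge unseen labels; this dict only covers known aliases.
CANONICAL_FIELDS = {
    "invoice_number": {"inv#", "inv no", "invoice number", "invoice #"},
    "total": {"total", "amount due", "balance due"},
}

def normalize_header(raw: str):
    """Map a raw OCR'd column header to a canonical field name, or None."""
    key = raw.strip().lower()
    for canonical, aliases in CANONICAL_FIELDS.items():
        if key in aliases:
            return canonical
    return None

print(normalize_header("Inv#"))        # invoice_number
print(normalize_header("Amount Due"))  # total
```

The LLM step earns its keep precisely where a table like this fails: labels it has never seen, OCR misreads, and context-dependent meanings.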
Tuesday, November 5, 2024
Monday, November 4, 2024
Sunday, November 3, 2024
Saturday, November 2, 2024
Friday, November 1, 2024
Thursday, October 31, 2024
Wednesday, October 30, 2024
New top story on Hacker News: Creating a LLM-as-a-Judge That Drives Business Results
Creating a LLM-as-a-Judge That Drives Business Results
16 by thenameless7741 | 0 comments on Hacker News.
Tuesday, October 29, 2024
Monday, October 28, 2024
Sunday, October 27, 2024
Saturday, October 26, 2024
Friday, October 25, 2024
Thursday, October 24, 2024
Wednesday, October 23, 2024
Tuesday, October 22, 2024
Monday, October 21, 2024
Sunday, October 20, 2024
New top story on Hacker News: Show HN: HN Update – Hourly News Broadcast of Top HN Stories
Show HN: HN Update – Hourly News Broadcast of Top HN Stories
4 by yunusabd | 4 comments on Hacker News.
I feel like it was inevitable, with the recent buzz around NotebookLM. I'm just surprised that it hasn't been done yet.
Saturday, October 19, 2024
Friday, October 18, 2024
Thursday, October 17, 2024
New top story on Hacker News: Ask HN: Why is there not more concern about the physical security of Cloudflare?
Ask HN: Why is there not more concern about the physical security of Cloudflare?
27 by dtquad | 23 comments on Hacker News.
Using Hetzner and Azure, we trust that our unencrypted in-memory data and business logic are housed in professional data centers with strong physical security measures. However, Cloudflare has built its Workers and serverless offerings on top of its Cache/CDN and anti-DDoS infrastructure, which operates out of questionable ISP and IXP colocation facilities in various jurisdictions with dubious standards. As an EU-based company, whenever we ask Cloudflare about the physical security of their edge locations, they consistently refer to encryption in transit and at rest—measures that do nothing to address threats like RAM interception or other physical security vulnerabilities in these questionable facilities. Moreover, when we raise these concerns, they attempt to upsell us on their Enterprise EU/FedRAMP offerings. Cloudflare has also deliberately restricted our ability to block non-Enterprise Workers, KV, and R2 from specific regions, leaving us with limited control over where our data is processed.
Wednesday, October 16, 2024
Tuesday, October 15, 2024
Monday, October 14, 2024
Sunday, October 13, 2024
Saturday, October 12, 2024
Friday, October 11, 2024
Thursday, October 10, 2024
Wednesday, October 9, 2024
New top story on Hacker News: Show HN: Donobu – Mac App for Web Automation and Testing
Show HN: Donobu – Mac App for Web Automation and Testing
23 by wewtyflakes | 2 comments on Hacker News.
Been working on a desktop app for Mac that lets you create web flows and rerun them ( https://www.donobu.com/ ). You can optionally use AI (BYOK: bring your own keys) to create flows for you and to do other interesting things, like making vision-based semantic assertions. Also, your data lives on your own filesystem, and we do not see any of it (further still, there is no phoning home at all). A nice benefit of this being a desktop app rather than a SaaS product is that if you happen to be developing/iterating on a webpage locally, it has no problem hooking into it.

What this intends to be a good fit for:
- Testing web pages, especially locally.
- Exploring random webpages with a stated objective.
- Automating tedious flows. Rerunning a flow won't get caught up on a single selector (many websites randomize element IDs, for instance); there is smart failover using a prioritized list of selectors.
- Getting a quick draft of an end-to-end test in JavaScript.

What this is a bad fit for:
- Mass web scraping (too slow).
- Adversarial websites.

What we are still working out:
- Click-and-drag operations.
- Websites that are primarily controlled from canvas.
- Smoothing out UI/UX (we are two backend engineers trying our best, and are handily outgunned by real frontend engineers).

Fun things to try:
- Asking it to assert that a webpage has a certain theme.
- Asking it to run an accessibility report for a page (uses https://ift.tt/3CnP94J ).
- Asking it to run a cookie report for a page.

The tech:
- Java 21 for the main business logic.
- Javalin 6 for the web framework ( https://javalin.io/ ).
- Playwright for controlling the browser ( https://ift.tt/w6UnerK ).
- Axe for running accessibility reports ( https://ift.tt/3CnP94J ).

Critical feedback is welcome. Thanks for trying it out! Cheers, -Justin and Vaz
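The "prioritized list of selectors" idea is simple to sketch: try each candidate in order and use the first one that resolves, so a flow survives a site regenerating its element IDs. The snippet below is a hypothetical illustration, not Donobu's code (Donobu itself is written in Java); the `locate` helper and the fake `page` dict stand in for a real browser query such as Playwright's `query_selector`.

```python
def locate(page: dict, selectors: list) -> str:
    """Return the element matched by the first selector that resolves.

    `page` is a stand-in for a live page: a dict mapping the selectors that
    would currently match to their elements.
    """
    for sel in selectors:
        if sel in page:          # stand-in for page.query_selector(sel)
            return page[sel]
    raise LookupError("none of %r matched" % (selectors,))

# The site regenerated its random button ID, but the test-id fallback
# in the prioritized list still finds the element.
page = {'[data-testid="submit"]': "<button>", "text=Submit": "<button>"}
print(locate(page, ["#btn-4f2a", '[data-testid="submit"]', "text=Submit"]))
```

Recording several selectors per element at flow-creation time (ID, test attribute, visible text) is what makes the replay robust; a single brittle selector is the usual reason recorded web flows rot.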
Tuesday, October 8, 2024
Monday, October 7, 2024
Sunday, October 6, 2024
Saturday, October 5, 2024
Friday, October 4, 2024
Thursday, October 3, 2024
Wednesday, October 2, 2024
Tuesday, October 1, 2024
Monday, September 30, 2024
Sunday, September 29, 2024
Saturday, September 28, 2024
New top story on Hacker News: Show HN: Bringing multithreading to Python's async event loop
Show HN: Bringing multithreading to Python's async event loop
11 by nbsande | 1 comments on Hacker News.
This project explores the integration of multithreading into the asyncio event loop in Python. While this was initially built with enhancing CPU utilization for FastAPI servers in mind, the approach can be used with more general async programs too. If you’re interested in diving deeper into the details, I’ve written a blog post about it here: https://ift.tt/BGzafih
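For readers unfamiliar with mixing threads and the asyncio event loop, the standard library already offers a basic building block: `asyncio.to_thread` offloads blocking or CPU-bound work to a worker thread while the loop stays responsive. The linked project goes much further (running the loop itself across threads); this sketch shows only the stdlib baseline.

```python
import asyncio
import threading

def blocking_sum(n: int) -> int:
    # Runs in a worker thread, not the event-loop (main) thread.
    assert threading.current_thread() is not threading.main_thread()
    return sum(range(n))

async def main() -> int:
    # Both calls run in worker threads, concurrently with the event loop.
    a, b = await asyncio.gather(
        asyncio.to_thread(blocking_sum, 1_000),
        asyncio.to_thread(blocking_sum, 2_000),
    )
    return a + b

print(asyncio.run(main()))  # 499500 + 1999000 = 2498500
```

Note that because of the GIL, `to_thread` helps with blocking I/O and with C extensions that release the GIL; pure-Python CPU work does not actually parallelize this way, which is part of what motivates approaches like the one in the post.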
New top story on Hacker News: Show HN: Modern Benchmarking Tooling for JavaScript
Show HN: Modern Benchmarking Tooling for JavaScript
7 by evnwashere | 2 comments on Hacker News.
I've always had a sweet tooth for how easy google/benchmark is to use, but when working with JS, the current libraries didn't feel right, and some were not even accurate enough, so I decided to create my own library to make JavaScript benchmarking tooling better. With more free time, I finally implemented all the features I wished for in 1.0.0, and made a lightweight C++ single-header version for moments when google/benchmark is too much. Hope this library helps you as much as it does me.
New top story on Hacker News: Autossh – automatically restart SSH sessions and tunnels
Autossh – automatically restart SSH sessions and tunnels
26 by denysonique | 5 comments on Hacker News.
Friday, September 27, 2024
Thursday, September 26, 2024
Wednesday, September 25, 2024
Tuesday, September 24, 2024
Monday, September 23, 2024
New top story on Hacker News: Launch HN: Panora (YC S24) – Data Integration API for LLMs
Launch HN: Panora (YC S24) – Data Integration API for LLMs
7 by nael_ob | 0 comments on Hacker News.
Hey HN! We're Nael and Rachid, and we're building Panora ( https://ift.tt/IucP84n ), an open-source API that connects various data sources to LLMs, from 3rd-party integrations to embedding and chunk generation. Here's a demo: https://www.youtube.com/watch?v=45QaN8mzAfg , and you can check our docs here: https://ift.tt/XumNG7j Our GitHub repo is at https://ift.tt/IucP84n .

Building integrations by hand is tedious and time-consuming. You must adapt to API documentation quirks and manage request retries, OAuth/API-key authorization, refresh tokens, rate limits, and data-sync freshness. On top of that, with the rise of AI-powered apps, you have to handle embedding and chunking for all the unstructured data, and keep up with the constant evolution of embedding models and chunking techniques.

The dominant player in this space is Merge.dev, but it has several drawbacks:
1. It's a black box for most developers, lacking transparency on data handling.
2. Strong vendor lock-in: once an end-user connects their software, it's challenging to access authorization tokens if you want to perform requests on their behalf after leaving Merge.
3. Long time-to-deploy for the long tail of integrations, leading to lost opportunities as integrations become the backbone of LLM-based applications.
4. Unrealistic prices per connection (the action of one end-user connecting their tool).
5. It's not positioned to serve LLM-based products that need RAG-ready data to power their use cases.

That's how Panora was born. We set out to build a solution that addresses these pain points head-on: something both developer-friendly and open-source. Our goal was to simplify the complex world of integrations and data preparation for LLMs, allowing developers to focus on building great products rather than wrestling with integration headaches. Panora is 100% open-source under the Apache 2.0 license, and you can either use our cloud version or self-host the product.

We provide two ways for your end-users to connect their software seamlessly:
1. A frontend SDK (React) that embeds the integrations catalog within your app.
2. A magic link that you can share with anyone, allowing them to connect their software.

You can either use your own OAuth clients or our managed ones. You receive a connection token per user and per provider connected, which you use to retrieve/insert data through our universal API. We have different categories of software, such as CRMs or file storage. Every category is divided into entities (e.g., File Storage has File, Folder, Drive, Group & User) following a standard data model. You also have access to remote data (non-transformed data from the provider) within each response, so you can build custom and complex integrations on your end. If the remote data isn't enough beyond the standard data model, you can create custom fields either via the API or our dashboard to map your remote fields to our model.

We're more than just integrations: we provide ready data for your RAG applications, with auto-generation of embeddings and chunks for all your synced documents. You can select your own vector database and embedding model in the dashboard; we then sync your documents and store the chunks/embeddings in the specified vector DB. We keep the data up to date and notify you through webhooks, and you can set a custom sync frequency (hourly, once a day, etc.) depending on your use case. Developers use our API to access fragmented data across various software, such as file-storage systems (Google Drive, OneDrive, SharePoint), and retrieve the embeddings of their documents using a single API. Our backend SDK is available for Python, TypeScript, Ruby, and Go.

Your honest feedback, suggestions, and wishes would be very helpful. We'd love to hear about your integration stories, the challenges you've faced with data integration for LLMs, and any thoughts on our approach. Thanks, HN!
Sunday, September 22, 2024
Saturday, September 21, 2024
Friday, September 20, 2024
New top story on Hacker News: Show HN: EloqKV – Scalable distributed ACID key-value database with Redis API
Show HN: EloqKV – Scalable distributed ACID key-value database with Redis API
11 by hubertzhang | 19 comments on Hacker News.
We're thrilled to unveil EloqKV, a lightning-fast distributed key-value store with a Redis-compatible API. Built on a new database architecture called the Data Substrate, EloqKV brings significant innovations to database design. Here are the unique features that make it stand out:

- Flexible Deployment: Run it as a single-node in-memory KV cache, a larger-than-memory database, or scale with ease to a highly available, distributed transactional database.
- High Performance: Achieves performance levels comparable to top in-memory databases like Redis and DragonflyDB, while significantly outperforming durable KV stores like KVRocks.
- Full ACID Transactions: Ensures complete transactional integrity, even in distributed environments.
- Independent Resource Scaling: Scale CPU, memory, storage, and logging resources independently to meet your needs.

We'd love to hear your thoughts and feedback!
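"Redis-compatible API" means any RESP-speaking client can drive the store unchanged. As an illustration of what a client puts on the wire for an atomic MULTI/EXEC transaction (the kind of operation EloqKV claims full ACID guarantees for), here is the standard RESP encoding. This is generic Redis protocol framing, not anything EloqKV-specific, and the account keys are made up.

```python
def encode_resp(*args):
    """Encode one command as a RESP array of bulk strings."""
    out = [f"*{len(args)}\r\n".encode()]
    for a in args:
        b = a if isinstance(a, bytes) else str(a).encode()
        out.append(f"${len(b)}\r\n".encode() + b + b"\r\n")
    return b"".join(out)

# An atomic transfer expressed as a Redis transaction: commands sent
# after MULTI are queued by the server and applied together at EXEC.
wire = b"".join([
    encode_resp("MULTI"),
    encode_resp("DECRBY", "account:alice", 25),
    encode_resp("INCRBY", "account:bob", 25),
    encode_resp("EXEC"),
])
```

Note that in stock Redis this gives atomicity on a single node, whereas the post's claim is that the same client-visible API keeps ACID semantics in the distributed deployment.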
Thursday, September 19, 2024
New top story on Hacker News: Ask HN: What email service(s) do you use for your side projects?
Ask HN: What email service(s) do you use for your side projects?
13 by jtap | 19 comments on Hacker News.
I have a couple of side projects that I use with my friends, family, and myself. I'd like an address such as team@mysite.com for sending and receiving emails that I type out myself, and I'd also like to be able to send transactional emails (password resets, ...). I would think I'm not the only one with this problem. What do you all use to achieve this?
Wednesday, September 18, 2024
Tuesday, September 17, 2024
Monday, September 16, 2024
New top story on Hacker News: Ask HN: What runs L4-related microkernels/hypervisors these days?
Ask HN: What runs L4-related microkernels/hypervisors these days?
19 by AlexWandell | 5 comments on Hacker News.
I've been learning about the L4 microkernel, and am thinking about doing something related to it for a research project. I'm especially curious about more recent examples of specific devices that run L4 variants (seL4, PikeOS, OKL4, etc.). I already found a few that use seL4, but to take OKL4 as an example, most of the specific devices I could find are from more than a decade ago, and I'm trying to find things from at least the last 5 or 6 years. I'm even more curious to find devices that use a form of L4 as a hypervisor. Has anyone here worked on a device that used an L4-related kernel or hypervisor? I know one major area they're used in is defense and full of NDAs, but hopefully some of the other industries they're used in (medical devices, automotive, IoT) are a little less restrictive. Thanks in advance!
Sunday, September 15, 2024
Saturday, September 14, 2024
Friday, September 13, 2024
Thursday, September 12, 2024
Wednesday, September 11, 2024
Tuesday, September 10, 2024
Monday, September 9, 2024
New top story on Hacker News: Ask HN: How do you manage your prompts in ChatGPT?
Ask HN: How do you manage your prompts in ChatGPT?
10 by nabi_nafio | 5 comments on Hacker News.
I use ChatGPT regularly for a lot of different tasks: coding, health Q&A, and summarizing docs, for example. The conversations stack up in the sidebar, which becomes very difficult to manage. I frequently have to refer back to a prompt I wrote previously, but I usually give up looking for it because of the tedious scroll-and-search process. I was wondering if there is an easier way. How do you manage your prompts in ChatGPT?
