Asynchronous Error Handling Is Hard
11 by hedgehog | 1 comment on Hacker News.
Sunday, June 29, 2025
New top story on Hacker News: Show HN: A tool to benchmark LLM APIs (OpenAI, Claude, local/self-hosted)
Show HN: A tool to benchmark LLM APIs (OpenAI, Claude, local/self-hosted)
3 by mrqjr | 1 comment on Hacker News.
I recently built a small open-source tool to benchmark different LLM API endpoints, including OpenAI, Claude, and self-hosted models (like llama.cpp). It runs a configurable number of test requests and reports two key metrics:
• First-token latency (ms): how long it takes for the first token to appear
• Output speed (tokens/sec): overall output fluency
Demo: https://llmapitest.com/ Code: https://ift.tt/Qf0qSk2
The goal is to provide a simple, visual, and reproducible way to evaluate performance across different LLM providers, including the growing number of third-party “proxy” or “cheap LLM API” services. It supports:
• OpenAI-compatible APIs (official + proxies)
• Claude (via Anthropic)
• Local endpoints (custom/self-hosted)
You can also self-host it with docker-compose. The config is clean, and adding a new provider requires only a simple plugin-style addition. Would love feedback, PRs, or even test reports from the APIs you’re using. Especially interested in how some lesser-known services compare.
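The two reported metrics reduce to simple timestamp arithmetic. Here is a rough sketch (not the tool's actual code; in a real run the timestamps would come from `time.monotonic()` recorded before the request and on each streamed chunk):

```python
def stream_metrics(request_time, token_times):
    """Compute first-token latency (ms) and output speed (tokens/sec).

    request_time: when the request was sent; token_times: arrival time of
    each streamed token (both in seconds, e.g. from time.monotonic())."""
    if not token_times:
        raise ValueError("no tokens received")
    first_token_ms = (token_times[0] - request_time) * 1000.0
    tokens_per_sec = len(token_times) / (token_times[-1] - request_time)
    return {"first_token_ms": first_token_ms, "tokens_per_sec": tokens_per_sec}

# Synthetic run: request sent at t=0, four tokens arriving over one second.
print(stream_metrics(0.0, [0.25, 0.5, 0.75, 1.0]))
# -> {'first_token_ms': 250.0, 'tokens_per_sec': 4.0}
```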
Saturday, June 21, 2025
New top story on Hacker News: Show HN: OSAI-Browser – A P2P Browser for Web3 and HTML Games
Show HN: OSAI-Browser – A P2P Browser for Web3 and HTML Games
5 by EvoSync | 2 comments on Hacker News.
https://ift.tt/uzgjLS2 OSAI Browser is a peer-to-peer (P2P) browser currently in active development. My goal is to redefine how we interact with the web, focusing on decentralization and cutting-edge capabilities for web content. As a core future capability, I envision distributed computing for high-quality web games and IoT applications, leveraging the P2P architecture to achieve impressive image fidelity and performance. Imagine games that harness the collective power of connected users, or IoT devices seamlessly interacting through a decentralized browser! Currently, the browser lets users drag and drop ZIP files to install and run web games, which demonstrates the practical application of its P2P distribution model. Crucially, both the server and client are already up and running, providing a solid foundation for the P2P network. We also plan to support WebAssembly (WASM) and various game engines to expand its capabilities. Please note that OSAI-Browser is still an early-stage project and a work in progress. Your constructive feedback and suggestions are highly appreciated as we continue to develop and refine it. For coders: https://ift.tt/SNZlnOB (the code is still rough, though).
Friday, June 20, 2025
New top story on Hacker News: Show HN: SecureBuild – Zero-CVE Images That Pay OSS Projects
Show HN: SecureBuild – Zero-CVE Images That Pay OSS Projects
18 by grantlmiller | 7 comments on Hacker News.
We're launching SecureBuild ( https://securebuild.com ), a new way for open source projects and maintainers to earn revenue by partnering with us and endorsing our Zero-CVE container images of their projects. We’ve spent the last decade at Replicated ( https://ift.tt/1wbnhDj ) helping commercial and open source software vendors securely distribute their apps to enterprise environments. During that time, we saw firsthand how hard it is for maintainers to fund their work, and how demanding enterprises have become about demonstrable security and scanning. SecureBuild is our attempt to bridge that gap. Built on top of Wolfi ( https://ift.tt/2QUdt4R ), we provide Zero-CVE container images with tight SLAs, full SBOMs, etc., and we route 70% of direct subscription revenue back to the open source projects that create them. We’re especially interested in partnering with open source maintainers who want to make their projects more secure and sustainable without changing licenses. We handle builds, hosting, sales, patching, and customer delivery. I'm Grant ( https://ift.tt/ViY5IX0 ), co-founder of Replicated and co-creator of SecureBuild, working with my co-founder Marc Campbell ( https://ift.tt/NtiZk6R ). We hope this can be part of a broader push toward a more secure, economically sustainable future for open source. Happy to answer questions and share more details!
Thursday, June 19, 2025
New top story on Hacker News: Show HN: EnrichMCP – A Python ORM for Agents
Show HN: EnrichMCP – A Python ORM for Agents
21 by bloppe | 0 comments on Hacker News.
I've been working with the Featureform team on their new open-source project, [EnrichMCP][1], a Python ORM framework that helps AI agents understand and interact with your data in a structured, semantic way. EnrichMCP is built on top of [MCP][2] and acts like an ORM, but for agents instead of humans. You define your data model using SQLAlchemy, APIs, or custom logic, and EnrichMCP turns it into a type-safe, introspectable interface that agents can discover, traverse, and invoke. It auto-generates tools from your models, validates all I/O with Pydantic, handles relationships, and supports schema discovery. Agents can go from user → orders → product naturally, just like a developer navigating an ORM. We use this internally to let agents query production systems, call APIs, apply business logic, and even integrate ML models. It works out of the box with SQLAlchemy and is easy to extend to any data source. If you're building agentic systems or anything AI-native, I'd love your feedback. Code and docs are here: https://ift.tt/krfJAq4 . Happy to answer any questions. [1]: https://ift.tt/krfJAq4 [2]: https://ift.tt/DNv3enU
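EnrichMCP's own API isn't spelled out above, so as a rough illustration of the idea only (the names here are hypothetical, not the library's API): an introspectable model with traversable relationships, in plain Python, looks roughly like this.

```python
from dataclasses import dataclass, field, fields

@dataclass
class Product:
    name: str

@dataclass
class Order:
    id: int
    product: Product

@dataclass
class User:
    name: str
    orders: list = field(default_factory=list)

def describe(model):
    """Schema discovery: list field names and types so an agent can
    decide which relationship to traverse next."""
    return {f.name: f.type for f in fields(model)}

u = User("ada", orders=[Order(1, Product("keyboard"))])
# An agent inspects the schema, then walks user -> orders -> product.
print(describe(User))
print(u.orders[0].product.name)  # "keyboard"
```

The framework's contribution is generating this kind of typed, discoverable surface automatically from SQLAlchemy models or APIs, and exposing it over MCP.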
Wednesday, June 18, 2025
New top story on Hacker News: Show HN: Free local security checks for AI coding in VSCode, Cursor and Windsurf
Show HN: Free local security checks for AI coding in VSCode, Cursor and Windsurf
11 by jaimefjorge | 5 comments on Hacker News.
Hi HN! We just launched Codacy Guardrails, an IDE extension with a CLI for code analysis and an MCP server that enforces security & quality rules on AI-generated code in real time. It hooks into AI coding assistants (like VS Code Agent Mode, Cursor, and Windsurf), silently scanning and fixing AI-suggested code that has vulnerabilities or violates your coding standards, while the code is being generated. We built this because coding agents can be a double-edged sword: they do boost productivity, but can easily introduce insecure or non-compliant code. One recent research team at NYU found that 40% of Copilot’s outputs were buggy or exploitable [1], and other surveys report that people are spending more time debugging AI-generated code [2]. That's why we created “guardrails” to catch security problems early. Codacy Guardrails uses a collection of open-source static analyzers (like Semgrep and Trivy) to scan the AI’s output against 2,000+ rules. We currently support JavaScript/TypeScript, Python, and Java, focusing on things like OWASP Top 10 vulnerabilities, hardcoded secrets, dependency checks, and code complexity and styling violations, and you can customize the rules to match your project’s needs. We're not using any AI models; it's “classic” static code analysis working alongside your AI assistant. Here’s a quick demo: https://youtu.be/pB02u0ntQpM The extension is free for all developers. (We do have paid plans for teams to apply rules centrally, but that’s not needed to use the extension and local code analysis with agents.) Setup is pretty straightforward: install the extension and enable Codacy’s CLI and MCP Server from the sidebar. We’re eager to hear what the HN community thinks! Does this approach sound useful in your AI coding workflow? Have you encountered security issues from AI-generated code? We hope Codacy Guardrails can make AI-assisted development a bit safer and more trustworthy. Thanks for reading! Get extension: https://ift.tt/9NPSfF5 Docs: https://ift.tt/pFMPGSg... 
Sources [1]: NYU Research: https://ift.tt/x0bYnVw... [2]: https://ift.tt/a8k3Fs0...
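As a toy illustration of the kind of rule such analyzers apply (not Codacy's or Semgrep's actual implementation; the pattern below is deliberately simplistic), a hardcoded-secret check can start as a line-by-line pattern match over generated code:

```python
import re

# Toy rule: flag likely hardcoded credentials. Real analyzers like Semgrep
# use far richer, language-aware rules than this single regex.
SECRET_RULE = re.compile(
    r'(?i)(api[_-]?key|secret|password|token)\s*[:=]\s*["\'][^"\']{8,}["\']'
)

def scan(source):
    """Return (line_number, line) pairs that match the secret rule."""
    return [
        (i, line.strip())
        for i, line in enumerate(source.splitlines(), start=1)
        if SECRET_RULE.search(line)
    ]

snippet = '''
db_host = "localhost"
API_KEY = "sk-test-1234567890abcdef"
'''
print(scan(snippet))  # flags only the API_KEY line
```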
Monday, June 16, 2025
New top story on Hacker News: Show HN: Trieve CLI – Terminal-Based LLM Agent Loop with Search Tool for PDFs
Show HN: Trieve CLI – Terminal-Based LLM Agent Loop with Search Tool for PDFs
16 by skeptrune | 0 comments on Hacker News.
Hi HN, I built a CLI for uploading documents and querying them with an LLM agent that uses search tools rather than stuffing everything into the context window. I recorded a demo using the CrossFit 2025 rulebook that shows how this approach compares to traditional RAG and direct context injection[1]. The core insight is that LLMs running in loops with tool access are unreasonably effective at this kind of knowledge-retrieval task[2]. Instead of hoping the right chunks make it into your context, the agent can iteratively search, refine queries, and reason about what it finds. The CLI handles the full workflow:

```bash
trieve upload ./document.pdf
trieve ask "What are the key findings?"
```

You can customize the RAG behavior and check upload status, and the responses stream back with expandable source references. I really enjoy having this workflow available in the terminal and I'm curious if others find this paradigm as compelling as I do. Considering adding more commands and customization options if there's interest. The tool is free for up to 1k document chunks. Source code is on GitHub[3] and available via npm[4]. Would love any feedback on the approach or CLI design! [1]: https://www.youtube.com/watch?v=SAV-esDsRUk [2]: https://ift.tt/ZKCFfyY [3]: https://ift.tt/uUJ4k0M... [4]: https://ift.tt/jYIOsxL
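The agent-loop idea is roughly the following (a minimal mock, not Trieve's implementation; the search corpus, model, and stop condition are all stand-ins so the control flow is visible without any API calls):

```python
def agent_loop(question, search, llm, max_steps=3):
    """Minimal tool-using loop: the model may request more searches,
    refining its query each round, before committing to an answer."""
    query, notes = question, []
    for _ in range(max_steps):
        notes.extend(search(query))
        action, payload = llm(question, notes)
        if action == "answer":
            return payload
        query = payload  # "search" action: refine the query and loop again
    return llm(question, notes)[1]  # out of steps: answer with what we have

# Mock tools standing in for the search index and the model.
corpus = {"rulebook": "Athletes must submit scores by Monday."}
def search(q):
    return [v for k, v in corpus.items() if q and k in q.lower()]
def llm(question, notes):
    if notes:
        return ("answer", notes[0])
    return ("search", "rulebook deadline")

print(agent_loop("What does the rulebook say about deadlines?", search, llm))
```

The point of the loop, versus one-shot RAG, is that a miss on the first search becomes a refined second query rather than a hallucinated answer.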
Thursday, June 12, 2025
New top story on Hacker News: Show HN: ChatToSTL – AI text-to-CAD for 3D printing
Show HN: ChatToSTL – AI text-to-CAD for 3D printing
4 by flowful | 0 comments on Hacker News.
Hey HN, I'm a beginner at CAD so I built an app that does it for me ;) Describe a part and ChatToSTL writes the OpenSCAD code, shows a live render with size sliders, then exports the STL/3MF file. Because the output is parametric, it's easy to modify (unlike mesh models from Shap-E or DreamFusion). Try it (needs your own OpenAI key): https://ift.tt/vUM0BTc
How it works: text prompt → o4-mini generates OpenSCAD code → live render + sliders → refine in chat → export.
Examples & code:
* Walkthrough + real prints (bowl, hook, box, door stop): https://ift.tt/QRs67qh...
* 90-sec demo: https://www.youtube.com/watch?v=ZK_IDaNn1Mk
* MIT repo: https://ift.tt/krZsmeX
Current limitations (it's not replacing Fusion 360 anytime soon):
- Simple shapes only; even a mug can end up with a misplaced handle
- Works best with CAD-style language ("extrude 5mm")
- The AI can't see the render, so there's no self-correction yet
I'm particularly interested in feedback on improving the 3D generation quality: should I add vision feedback so it can self-critique? Use CadQuery instead of OpenSCAD? Use a different model? Thanks! Nico
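To see why parametric output is easier to modify than a mesh: the model emits OpenSCAD source with named size variables, so a slider just rewrites a number and re-renders. A hypothetical sketch of that shape of output (the hook geometry and names here are illustrative, not from the app):

```python
def hook_scad(width=10, height=40, thickness=4):
    """Emit parametric OpenSCAD for a simple L-shaped wall hook.
    Changing one parameter regenerates the whole model, which is
    what the app's size sliders do before exporting an STL."""
    return f"""\
width = {width};
height = {height};
thickness = {thickness};
// vertical plate
cube([width, thickness, height]);
// lip along the bottom that items hang on
cube([width, height / 2, thickness]);
"""

print(hook_scad(width=12))
```

A mesh from Shap-E or DreamFusion has no such handles: resizing one dimension means editing thousands of triangles.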
Monday, June 9, 2025
New top story on Hacker News: Apple introduces a universal design across platforms
Apple introduces a universal design across platforms
32 by meetpateltech | 40 comments on Hacker News.
Thursday, June 5, 2025
New top story on Hacker News: Show HN: ClickStack – open-source Datadog alternative by ClickHouse and HyperDX
Show HN: ClickStack – open-source Datadog alternative by ClickHouse and HyperDX
30 by mikeshi42 | 5 comments on Hacker News.
Hey HN! Mike & Warren here from HyperDX (now part of ClickHouse)! We’ve been building ClickStack, an open-source observability stack that helps you collect, centralize, and search/viz/alert on your telemetry (logs, metrics, traces) in just a few minutes, all powered by ClickHouse (Apache 2.0) for storage, HyperDX (MIT) for visualization, and OpenTelemetry (Apache 2.0) for ingestion. You can check out the quick start for spinning things up in the repo here: https://ift.tt/JywVbO6 ClickStack makes it really easy to instrument your application so you can go from a bug report of “my checkout didn’t go through” to a session replay of the user, backend API calls, DB queries, and infrastructure metrics related to that specific request, all in a single view. For those migrating from Very Expensive Observability Vendor (TM) to something open source, more performant, and free of aggressive retention limits and sampling rates, ClickStack gives a batteries-included way of starting that migration journey. For those who aren’t familiar with ClickHouse, it’s a high-performance database already used by companies such as Anthropic, Cloudflare, and DoorDash to power their core observability at scale, thanks to its flexibility, ease of use, and cost-effectiveness. Until now, though, that required teams to dedicate engineers to building a custom observability stack: it was difficult to get telemetry data into ClickHouse easily, and there was no native UI experience. That’s why we’re building ClickStack. We wanted to bundle an easy way to start ingesting your telemetry data, whether it’s logs & traces from Node.js or Ruby or metrics from Kubernetes or your bare-metal infrastructure. Just as important, we wanted users to enjoy a visualization experience that lets them search quickly using a familiar Lucene-like syntax (similar to what you’d use in Google!). 
We recognize, though, that a SQL mode is needed for the most complex queries. We've also added high-cardinality outlier analysis that charts the delta between outlier and inlier events, which we've found really helpful for narrowing down the causes of regressions/anomalies in our traces, as well as log patterns that condense clusters of similar logs. We’re really excited about the roadmap ahead for improving both ClickStack as a product and the ClickHouse core database for observability. Would love to hear everyone’s feedback! Spinning up a container is pretty simple: `docker run -p 8080:8080 -p 4317:4317 -p 4318:4318 docker.hyperdx.io/hyperdx/hyperdx-all-in-one` In-browser live demo (no sign-ups or anything silly; it runs fully in your browser!): https://ift.tt/q0JDPKg Landing page: https://ift.tt/DFVtkrc GitHub repo: https://ift.tt/JywVbO6 Discord community: https://ift.tt/UeGcPv4 Docs: https://ift.tt/I6hFNCi...
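The outlier-delta idea can be sketched like this (a simplified illustration, not ClickStack's implementation): bucket events by an attribute, then compare that attribute's share among outliers (e.g. slow requests) against inliers; large positive deltas point at likely culprits.

```python
from collections import Counter

def attribute_delta(events, is_outlier, key):
    """For each value of `key`, compute its frequency among outlier
    events minus its frequency among inlier events."""
    out = Counter(e[key] for e in events if is_outlier(e))
    inl = Counter(e[key] for e in events if not is_outlier(e))
    n_out, n_in = sum(out.values()) or 1, sum(inl.values()) or 1
    values = set(out) | set(inl)
    return sorted(
        ((v, out[v] / n_out - inl[v] / n_in) for v in values),
        key=lambda pair: -pair[1],
    )

events = [
    {"region": "us-east", "ms": 40}, {"region": "us-east", "ms": 55},
    {"region": "eu-west", "ms": 900}, {"region": "eu-west", "ms": 35},
]
# Outliers = requests slower than 500 ms; eu-west dominates them.
print(attribute_delta(events, lambda e: e["ms"] > 500, "region"))
```

In practice this runs as aggregations over trace attributes in ClickHouse rather than in application code, but the ranking principle is the same.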