Friday, December 20, 2024

New top story on Hacker News: Ask HN: How do you work with people who are "not quite smart"?

Ask HN: How do you work with people who are "not quite smart"?
3 by charles_f | 4 comments on Hacker News.
Hey, this is a touchy subject, and it might reflect a lack of awareness or empathy on my part. But trust that it comes from a genuine willingness to make things better for everyone.

We all work with people we find "not as good", who have different ways of working or different work ethics. After being told for decades that this is usually a problem of communication or point of view, I had somewhat internalized the idea. And it is often true. But what I've realized as of late is that there's a category of people who are not just working a different way but are, to put it bluntly, plainly not smart. What I'm talking about is people who are below average when it comes to understanding concepts, or conceptualizing altogether. Their intuition is always twisted and wrong. They completely lack critical thinking. Work needs to be decomposed into extremely precise steps for them if you want anything to happen. The type of person where you know anything assigned to them will be badly done; when you open a document or code written by them, you do it anticipating that it's gonna be bad in novel ways. And despite all of your efforts to try and coach them, they seem to make no progress (where the same coaching works on others).

I know there might be other causes for this, maybe something happening in their life, or a lack of interest in the task or of motivation overall. But I think I can make a clear distinction between someone who doesn't give a crap and someone who does but is not equipped to achieve the task at hand.

Some of them are direct colleagues whom I can give feedback about - but then what do you tell them or their manager? "It's very hard to explain concepts to you, you should work on conceptualization"? Some of them are in transverse roles; I don't even know what to do about those.
An "experienced" PM who systematically gives you hot garbage: apart from pointing out the lack of research and the absence of evidence-based reasoning, how do you tell someone "what you gave me is plain dumb, and that's not just a difference of opinion about how we should approach this product/feature, it's inconsistent and pure shit"? I'm not in a management position, so I can't "just fire them". Plus I don't want to go and tell them "you're just not equipped for your job, you should find another". It's quite disheartening. Does anyone have techniques they've used? Am I just missing something?

New top story on Hacker News: We're about to fly a spacecraft into the Sun for the first time

We're about to fly a spacecraft into the Sun for the first time
23 by pseudolus | 2 comments on Hacker News.


New top story on Hacker News: OpenAI O3 breakthrough high score on ARC-AGI-PUB

OpenAI O3 breakthrough high score on ARC-AGI-PUB
78 by maurycy | 15 comments on Hacker News.


New top story on Hacker News: Grayjay Desktop App

Grayjay Desktop App
53 by pierrelf | 22 comments on Hacker News.


Monday, December 16, 2024

New top story on Hacker News: Show HN: NCompass Technologies – yet another AI Inference API, but hear us out

Show HN: NCompass Technologies – yet another AI Inference API, but hear us out
3 by adiraja | 5 comments on Hacker News.
Hello HackerNews! I’m excited to share what we’ve been working on at nCompass Technologies: an AI inference platform that gives you a scalable and reliable API to access any open-source AI model, with no rate limits. We don't need rate limits because the optimizations we made to our model-serving software let us support a high number of concurrent requests without degrading quality of service for you as a user.

If you’re thinking "aren’t there a bunch of these already?", so were we when we started nCompass. When using other APIs, we found that they weren’t reliable enough to run open-source models in production environments. To resolve this, we're building an AI inference engine that enables you, as an end user, to reliably use open-source models in production. Underlying this API, we’re building optimizations at the hosting, scheduling, and kernel levels with a single goal: minimize the number of GPUs required while maximizing the number of concurrent requests you can serve, without degrading quality of service. We’re still building a lot of our optimizations, but we’ve released what we have so far via our API. We currently keep time-to-first-token (TTFT) 2-4x lower than vLLM at the equivalent concurrent request rate. You can check out a demo of our API here: https://ift.tt/WeBS91Q

As a result of the optimizations we’ve rolled out so far, we’re releasing a few unique features on our API:

1. Rate limits: we don’t have any. Most other APIs out there have strict rate limits and can be rather unreliable. We don’t want APIs for open-source models to remain a solution for prototypes only. We want people to use these APIs like they do OpenAI’s or Anthropic’s and actually build production-grade products on top of open-source models.

2. Underserved models: we have them. There are a ton of models out there, but not all of them are readily available for people to use if they don’t have access to GPUs.
We envision our API becoming a system where anyone can launch any custom model of their choice with minimal cold starts and run it as a simple API call. Our cold starts for any 8B or 70B model are only 40s, and we’ll keep improving this. Toward this goal, we already host models like `ai4bharat/hercule-hi` to support non-English language use cases and models like `Qwen/QwQ-32B-Preview` to support reasoning-based use cases. You can find the other models that we host here: https://ift.tt/pWokNiS. We’d love for you to try out our API by following the steps here: https://ift.tt/0g8aAoV . We provide $100 of free credit on sign-up to run models, and like we said, go crazy with your requests; we’d love to see if you can break our system :) We’re still actively building out features and optimizations, and your input can help shape the future of nCompass. If you have thoughts on our platform or want us to host a specific model, let us know at hello@ncompass.tech. Happy Hacking!
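The TTFT figure above can be measured with a small sketch like the following. This is generic timing code, not nCompass's actual tooling; the streaming generator is a stand-in for a real streaming chat-completions call.

```python
import time

def time_to_first_token(stream):
    """Seconds from request start until the first streamed token arrives.

    `stream` is any iterator that yields tokens; in practice it would wrap
    a streaming API response.
    """
    start = time.perf_counter()
    for token in stream:
        return time.perf_counter() - start, token
    raise RuntimeError("stream produced no tokens")

def fake_stream():
    # Stand-in for queueing + prefill latency before the first token.
    time.sleep(0.05)
    yield "Hello"
    yield " world"

ttft, first = time_to_first_token(fake_stream())
```

Averaging `ttft` over many concurrent calls at a fixed request rate is how a vLLM-style comparison would typically be made.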

Friday, December 13, 2024

New top story on Hacker News: MarkItDown: Python tool for converting files and office documents to Markdown

MarkItDown: Python tool for converting files and office documents to Markdown
2 by Handy-Man | 0 comments on Hacker News.


New top story on Hacker News: Garbage Collected Smart Pointers in Rust via Concurrent Cycle Collection

Garbage Collected Smart Pointers in Rust via Concurrent Cycle Collection
23 by maplant | 1 comment on Hacker News.


New top story on Hacker News: People who are good at reading have different brains

People who are good at reading have different brains
22 by pseudolus | 3 comments on Hacker News.


New top story on Hacker News: Show HN: I made the slowest, most expensive GPT

Show HN: I made the slowest, most expensive GPT
23 by wluk | 13 comments on Hacker News.
This is another one of my automate-my-life projects: I'm constantly asking the same question to different AIs, since there's always the hope of getting a better answer somewhere else. Maybe ChatGPT's answer is too short, so I ask Perplexity. But I realize that's hallucinated, so I try Gemini. That answer sounds right, but I cross-reference with Claude just to make sure.

This doesn't really apply to math/coding (where o1 or Gemini can probably one-shot an excellent response), but more to online search, where information is more fluid and there's no single "right" search engine + text restructuring + model combination every time. Even o1 doesn't have online search, so it's obviously a hard problem to solve. An example is something like "best ski resorts in the US", which will get a different response from every GPT, but most of their rankings won't reflect actual skiers' consensus - say, on Reddit https://ift.tt/jm8DBXF... - because there are so many opinions floating around that a one-shot RAG search + LLM isn't going to have enough context to capture what everyone thinks. And obviously, offline GPTs like o1 and Sonnet/Haiku aren't going to have the latest updates if a resort closes, for example.

So I’ve spent the last few months experimenting with a new project that's basically the most expensive GPT I’ll ever run. It runs search queries through ChatGPT, Claude, Grok, Perplexity, Gemini, etc., then aggregates the responses. For added financial tragedy, in between it also uses multiple embedding models and performs iterative RAG searches through different search engines. This all functions sort of like one giant AI brain. So I pay for every search, then every embedding, then every intermediary LLM input/output, then the final LLM input/output. On average it costs about 10 to 30 cents per search. It's also extremely slow. https://ithy.com I know that sounds absurdly overkill, but that’s kind of the point.
The goal is to get the most accurate and comprehensive answer possible, because it's been vetted by a bunch of different AIs, each sourcing from different buckets of websites. Context limits today are just large enough that this type of search and cross-model iteration is possible, where we can measure the "overlap" between a diverse set of texts to determine some sort of consensus. The idea is to get online answers that aren't attainable from any single AI. If you end up trying this out, I'd recommend comparing Ithy's output against the other GPTs to see the difference. It's going to cost me a fortune to run this project (I'll probably keep it online for a month or two), but I see it as an exploration of what’s possible with today’s model APIs, rather than something that’s immediately practical. Think of it as an online o1 (without the $200/month price tag, though I'm offering a $29/month Pro plan to help subsidize). If nothing else, it’s a fun (and pricey) thought experiment.
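The "overlap between a diverse set of texts" idea can be pictured as simple majority voting over claims extracted from each model's answer. This is a toy sketch of the consensus step only (Ithy's real pipeline uses embeddings and iterative RAG); the ski-resort names are illustrative.

```python
from collections import Counter

def consensus(answers, threshold=0.5):
    """Keep claims that a majority of model answers agree on.

    `answers` is a list of claim sets, one per model; a claim survives
    if at least `threshold` of the models mention it.
    """
    counts = Counter(claim for claims in answers for claim in set(claims))
    needed = threshold * len(answers)
    return {claim for claim, n in counts.items() if n >= needed}

# Three hypothetical model rankings; only the overlap survives.
answers = [
    {"Alta", "Jackson Hole", "Vail"},
    {"Alta", "Jackson Hole", "Park City"},
    {"Alta", "Snowbird", "Jackson Hole"},
]
agreed = consensus(answers)  # {"Alta", "Jackson Hole"}
```

In practice "same claim" would be decided by embedding similarity rather than exact string match, which is where the multiple embedding models come in.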

Thursday, December 12, 2024

New top story on Hacker News: Show HN: Gentrace – connect to your LLM app code and run/eval it from a UI

Show HN: Gentrace – connect to your LLM app code and run/eval it from a UI
7 by dsaffy | 0 comments on Hacker News.
Hey HN - Doug from Gentrace here. We originally launched via Show HN in August of 2023 as evaluation and observability for generative AI: https://ift.tt/wuPqS7Q

Since then, everyone from the model providers to LLM ops companies has built a prompt playground. We had one too, until we realized this was totally the wrong approach:

- It's not connected to your application code
- They don't support all models
- You have to rebuild evals for just this one prompt (you can't reuse your end-to-end evals)

In other words, it was a ton of work and time to use these to actually make your app better. So we built a new experience and are relaunching around this idea: Gentrace is a collaborative LLM app testing and experimentation platform that brings together engineers, PMs, subject matter experts, and more to run and test your actual end-to-end app. To do this, use our SDK to:

- connect your app to Gentrace as a live runner over websocket (local) / via webhook (staging, prod)
- wrap your parameters (e.g. prompt, model, top-k) so they become tunable knobs in the front end
- edit the parameters and then run / evaluate the actual app code with datasets and evals in Gentrace

We think it's great for tuning retrieval systems, upgrading models, and iterating on prompts. It's free to trial. Would love to hear your feedback / what you think!
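The "wrap your parameters so they become tunable knobs" step can be sketched generically like this. The `Knob` helper and `set_knob` are hypothetical names for illustration, not Gentrace's actual SDK; the point is that the app reads parameters through a registry at call time, so a UI can override them and re-run the real code.

```python
class Knob:
    """A named, overridable parameter; a front end could list and edit these."""
    registry = {}

    def __init__(self, name, default):
        self.name, self.value = name, default
        Knob.registry[name] = self

    def get(self):
        return self.value

def set_knob(name, value):
    # Stand-in for the front end pushing an edited value to the runner.
    Knob.registry[name].value = value

# The app declares its tunables once...
model = Knob("model", "gpt-4o-mini")
top_k = Knob("top_k", 5)

def answer(question):
    # ...and reads them at call time, so each run picks up current values.
    return f"[{model.get()}, top_k={top_k.get()}] {question}"

set_knob("top_k", 10)  # an experimenter overrides a knob before re-running
```

Because the knobs live inside the real application code, the same end-to-end evals apply to every parameter combination, which is the contrast with a standalone prompt playground.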

Tuesday, December 10, 2024

New top story on Hacker News: Ask HN: Those making $500/month on side projects in 2024 – Show and tell

Ask HN: Those making $500/month on side projects in 2024 – Show and tell
87 by cvbox | 72 comments on Hacker News.
It's that time of the year again, so I'd be interested to hear what new (and old) ideas have come up. Previously asked:

2023 → https://ift.tt/AVPiuM7
2022 → https://ift.tt/EG1Wk09
2021 → https://ift.tt/MR4ym5Z
2020 → https://ift.tt/9hLDFBO
2019 → https://ift.tt/u0mQAlr
2018 → https://ift.tt/wIWJPAH
2017 → https://ift.tt/r1VXNdu

Wednesday, December 4, 2024

New top story on Hacker News: Show HN: I combined spaced repetition with emails so you can remember anything

Show HN: I combined spaced repetition with emails so you can remember anything
15 by iskrataa | 3 comments on Hacker News.
Hey HN, I am a student shipping apps in my free time; this is my 4th for the year! Non-fiction books and podcasts have been part of my life for years now, but I always struggled with remembering what I’ve read or listened to. I wanted it to stick even after years. My notes list grew large, but I never really revisited it. That’s why I created GinkgoNotes. You enter the notes you want to recall and leave it to the app to create a personalised email schedule (based on spaced repetition). That means you’ll get your notes emailed to you a couple of times, exactly when you should read them again (based on Ebbinghaus's forgetting curve), so you’re far more likely to remember them. I hope this will be as helpful for you as it was for me. Would love some feedback! Iskren
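A minimal spaced-repetition email schedule in the Ebbinghaus spirit looks like the following: review gaps grow geometrically because each successful review roughly resets the forgetting curve. This is an illustrative sketch with assumed parameters (1-day first gap, doubling factor), not GinkgoNotes' actual schedule.

```python
from datetime import date, timedelta

def review_dates(start, reviews=5, first_gap_days=1, factor=2):
    """Email dates for one note: gaps of 1, 2, 4, 8, ... days after `start`.

    Each review pushes the next one `factor` times further out, mirroring
    the flattening of the forgetting curve after repeated recall.
    """
    dates, gap, day = [], first_gap_days, start
    for _ in range(reviews):
        day = day + timedelta(days=gap)
        dates.append(day)
        gap *= factor
    return dates

schedule = review_dates(date(2024, 12, 4))
# gaps of 1, 2, 4, 8, and 16 days after the note is created
```

Tuning `first_gap_days` and `factor` per user is where the "personalised" part would come in.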

Sunday, December 1, 2024

New top story on Hacker News: Show HN: Vicinity – Fast, Lightweight Nearest Neighbors with Flexible Back Ends

Show HN: Vicinity – Fast, Lightweight Nearest Neighbors with Flexible Back Ends
12 by Pringled | 0 comments on Hacker News.
We’ve just open-sourced Vicinity, a lightweight approximate nearest neighbors (ANN) search package that allows for fast experimentation with, and comparison of, a large number of well-known algorithms. Main features:

- Lightweight: the base package only uses NumPy
- Unified interface: use any of the supported algorithms and backends through a single interface; HNSW, Annoy, FAISS, and many more algorithms and libraries are supported
- Easy evaluation: measure the performance of your backend with a simple function that reports queries per second vs. recall
- Serialization: save and load your index for persistence

After working with a large number of ANN libraries over the years, we found it increasingly cumbersome to learn the interface, features, quirks, and limitations of each one. After writing custom evaluation code to measure speed and performance for the 100th time, we decided to build this as a way to use a large number of algorithms and libraries through a unified, simple interface that allows for quick comparison and evaluation. We are curious to hear your feedback! Are there any algorithms you use that are missing? Any extra evaluation metrics that would be useful?
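The NumPy-only baseline backend can be pictured as plain brute-force nearest neighbors, which is also the exact reference that recall is measured against. This is a sketch of the idea, not Vicinity's actual interface.

```python
import numpy as np

def nearest(query, vectors, k=3):
    """Exact k-nearest neighbors by Euclidean distance, pure NumPy."""
    dists = np.linalg.norm(vectors - query, axis=1)  # broadcasted distances
    idx = np.argsort(dists)[:k]                      # k smallest distances
    return idx, dists[idx]

rng = np.random.default_rng(0)
vectors = rng.normal(size=(1000, 8))
query = vectors[42]  # a stored vector is its own nearest neighbor
idx, dists = nearest(query, vectors, k=3)
```

An ANN backend (HNSW, Annoy, ...) trades some recall against this exact answer for much higher queries per second, which is the trade-off the evaluation function quantifies.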

New top story on Hacker News: Frontier supercomputer runs the largest astrophysical simulation of universe

Frontier supercomputer runs the largest astrophysical simulation of universe
3 by belter | 0 comments on Hacker News.


New top story on Hacker News: Steam Deck hits 17,000 games playable and verified

Steam Deck hits 17,000 games playable and verified
17 by WhyNotHugo | 2 comments on Hacker News.


New top story on Hacker News: December Adventure (Advent of Code alternative)

December Adventure (Advent of Code alternative)
5 by triyambakam | 0 comments on Hacker News.