Question 1

What does Ground Floor test each week?

Accepted Answer

Each Ground Floor experiment runs one real task for a regulated practice on a local, open-weight model: a SOAP note, a contract clause, a set of meeting minutes. It runs on Apple Silicon with the air-gap held. I publish what it did, the model and hardware used, and a plain verdict of viable, partial, or not yet.

Question 2

Are the failures published too?

Accepted Answer

Yes. An experiment gets a "not yet" verdict when the local model could not do the task well enough to trust, and that verdict stays up. The point is an honest record of what local AI can and cannot do for regulated work right now, not a highlight reel.

Question 3

What hardware and models do the experiments use?

Accepted Answer

The experiments run open-weight models on Apple Silicon Macs, from an M4 mini up to the two-M5-Max lab. Each write-up lists the exact model, the Mac, and the task, so you can judge whether it maps to your own setup.

Your data doesn't
leave this machine.

Can a local model turn client meeting voice memos into CRM-ready notes for an RIA?

Can a local 13B model flag risky clauses in a vendor contract for a solo attorney?

Can an $800 Mac Mini draft SOAP notes for a solo medical practice?

About the experiments

Your data doesn'tleave this machine.

Can a local model turn client meeting voice memos into CRM-ready notes for an RIA?

Can a local 13B model flag risky clauses in a vendor contract for a solo attorney?

Can an $800 Mac Mini draft SOAP notes for a solo medical practice?

About the experiments

Your data doesn't
leave this machine.