Files aren't databases
But now they act like one

Connect and query your distributed files in one place

01 ABOUT

Connect raw, distributed data sources with no migration and no pipelines

Access Anywhere

Install nodes on any machine and instantly make it part of your global data lake.

Format Agnostic

Excel, CSV, JSON — it doesn't matter. If it has structure, it can be queried.

Zero Migration

Keep your files exactly where they are — Twiddl brings the query to you.

Sync in One Click

Integrate every data source in one click, uniting your teams and devices in a single network.

It is easier than explaining to your manager what ETL means.

02 VALUE

Break free from rigid formats. Twiddl links any file, from any corner of your network

SERVER 1 - TOKYO
metrics.parquet
customers.json
LAPTOP - CHICAGO
sales_2023.csv
leads_q4.xlsx
IoT DEVICE - BERLIN
sensors_data.xml
device_health.avro

Drag, drop, query. That's it.

sales_2023.csv (Chicago): customer_id [JOIN key], purchase_date, total_amount, region
customers.json (Tokyo): customer_id [JOIN key], customer_name, email, company

TWIDDL · Unified Query Results

  Tanaka Corp       $123,456   Tokyo
  Suzuki Ltd        $89,012    Tokyo
  Smith Partners    $78,500    Chicago
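The cross-file join above can be sketched in plain Python. This is a minimal illustration of the idea (join a CSV and a JSON file on customer_id), not Twiddl's engine; the file contents and customer IDs are invented sample data that mirror the diagram.

```python
import csv
import io
import json

# Hypothetical in-memory stand-ins for the two remote files shown above.
sales_csv = """customer_id,purchase_date,total_amount,region
C001,2023-11-02,123456,Tokyo
C002,2023-11-05,89012,Tokyo
C003,2023-11-09,78500,Chicago
"""

customers_json = json.dumps([
    {"customer_id": "C001", "customer_name": "Tanaka Corp",
     "email": "[email protected]", "company": "Tanaka Corp"},
    {"customer_id": "C002", "customer_name": "Suzuki Ltd",
     "email": "[email protected]", "company": "Suzuki Ltd"},
    {"customer_id": "C003", "customer_name": "Smith Partners",
     "email": "[email protected]", "company": "Smith Partners"},
])

def join_on_customer_id(sales_text, customers_text):
    """Inner-join CSV sales rows with JSON customer records on customer_id."""
    customers = {c["customer_id"]: c for c in json.loads(customers_text)}
    results = []
    for row in csv.DictReader(io.StringIO(sales_text)):
        customer = customers.get(row["customer_id"])
        if customer:  # keep only rows that matched a customer record
            results.append((customer["customer_name"],
                            int(row["total_amount"]),
                            row["region"]))
    return results

print(join_on_customer_id(sales_csv, customers_json))
```

In Twiddl the two sides of this join live on different machines; the sketch only shows the shape of the result a unified query would return.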

03 HOW IT WORKS

Once your data's stored, it's queryable. No extra steps — just instant insight

01

Install Nodes

• One-click installation.
• Each machine adds to your global network.
• Node monitors your local files in place.
• Runs even on IoT devices.

02

Select Data

• Browse available files across all nodes.
• Combine any files to create your query.
• Use SQL or friendly interface.
• No SQL knowledge required.

03

Get Results

• As soon as a file is added to a node, instantly query it from anywhere.
• Near-real-time data, ready for analysis.
• Twiddl is your first step toward automated data pipelines.

> Technical Notes

  • Nodes run on your machines as Docker containers with minimal footprint.
  • Specified folders are monitored for any changes in file structure.
  • You can join and query any files across all nodes using Twiddl web interface.
  • SQL or Drag and Drop interface — friendly for any level of expertise.
  • Each node processes its own batch of data locally and transmits only the results.
  • Twiddl server aggregates results from nodes and presents them in one place.
  • Supported formats: XLSX, CSV, JSON, Parquet, XML, AVRO and more.
  • The node is lightweight enough to run on IoT devices for live monitoring.
  • Deploy Twiddl server on your local network, or use our web service.
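The note that each node "processes its own batch of data locally and transmits only the results" can be sketched as partial aggregation: nodes reduce their rows to small summaries, and the central server merges the summaries. Function names and the region/amount data are illustrative, not Twiddl's actual protocol.

```python
# Sketch: nodes ship partial sums instead of raw rows; the central
# server combines the partials into one final answer.

def node_partial_totals(rows):
    """Runs on a node: aggregate local (region, amount) rows locally."""
    totals = {}
    for region, amount in rows:
        totals[region] = totals.get(region, 0) + amount
    return totals  # only this small dict leaves the node

def central_merge(partials):
    """Runs centrally: merge per-node partial results."""
    merged = {}
    for partial in partials:
        for region, amount in partial.items():
            merged[region] = merged.get(region, 0) + amount
    return merged

tokyo_node = node_partial_totals([("Tokyo", 123456), ("Tokyo", 89012)])
chicago_node = node_partial_totals([("Chicago", 78500)])
print(central_merge([tokyo_node, chicago_node]))
```

The design point: bandwidth and exposure scale with the size of the answer, not the size of the files.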

04 UPDATES

Proof we’re building: real milestones over time

Mar 01, 2026 · Latest Update

Preparation Stage: Opening Twiddl for Live Evaluation

Twiddl is moving from controlled demos to structured, open evaluation. The focus now is consistency: making the system something real teams can experience end-to-end when they’re exploring distributed data access in day-to-day operations.

The near-term vision is simple and customer-driven: a publicly reachable central workspace in the cloud, connected to continuously running local nodes on Twiddl’s machines, backed by a curated dataset that looks like real operational data (files, drift, and change over time). The objective is to make the architecture visible through usage—not explanation.

The rollout includes:

  • A production-hosted central node exposed via webapp and available continuously
  • Persistent local nodes maintaining secure outbound connections (no inbound firewall changes) located on Twiddl's machines
  • A defined demonstration dataset that reflects real operational patterns (multiple sites, evolving schemas, mixed formats)
  • A stable query flow that highlights cross-node execution and aggregation (one question → multi-location answer)

This stage also marks the beginning of live walkthroughs, interactive Q&A sessions, and structured demo calls. Instead of slides or diagrams, the focus shifts to running real queries, reviewing what data stayed local, and discussing how pilots would look inside your environment.

The goal is simple: move from isolated demos to active evaluation conversations, and from conversations to early pilot deployments.

Sep 08, 2025 · Update 01

The First Obsession: “Why can’t I just query all of this?”

The friction is practical: CSVs, JSON exports, XLSX files, and logs are valuable immediately, yet most “analysis” still begins with prep work—one-off scripts, temporary pipelines, and manual cleanup. The question is about making existing files queryable as-is, without engineering overhead.

The first experiment is straightforward: treat mixed formats as one SQL-queryable surface using an embedded engine. DuckDB is the obvious candidate because it’s fast, local-first, and unusually capable at reading real-world files—even when those files are messy.

That mess is the point: schema drift, inconsistent headers, semi-broken JSON, and timestamp inconsistencies show up immediately, and they clarify the direction. Twiddl should be SQL-first and file-native, and it should absorb inconsistency so that querying stays simple.
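The kind of mess described above can be made concrete with a small stdlib sketch: two exports of "the same" data with drifted headers and mixed timestamp formats, normalized before querying. Twiddl leans on its embedded engine for this; the helpers here are hypothetical and only show the category of problem.

```python
import csv
import io
from datetime import datetime

def normalize_header(name):
    """Absorb header drift: ' Purchase Date ' and 'purchase_date' unify."""
    return name.strip().lower().replace(" ", "_")

def parse_timestamp(value):
    """Try a few common date formats; surface NULL instead of failing."""
    for fmt in ("%Y-%m-%d", "%d/%m/%Y", "%Y-%m-%dT%H:%M:%S"):
        try:
            return datetime.strptime(value, fmt).date().isoformat()
        except ValueError:
            continue
    return None

# Invented sample: inconsistent headers and two date conventions in one file.
messy_csv = """ Purchase Date ,Total Amount
2023-11-02,100
05/11/2023,250
"""

rows = []
for row in csv.DictReader(io.StringIO(messy_csv)):
    clean = {normalize_header(k): v.strip() for k, v in row.items()}
    clean["purchase_date"] = parse_timestamp(clean["purchase_date"])
    rows.append(clean)
print(rows)
```

Absorbing this class of inconsistency at the edges is what keeps the querying surface simple.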

  • Local SQL execution is efficient; in real organizations, the bigger problem is that files are spread across teams, tools, and locations.
  • The goal isn’t “one format to rule them all,” but one interface to ask questions across what you already have.
  • The real cost is the delay between “data exists” and “answers are accessible”—and the people-time spent bridging that gap.

Oct 23, 2025 · Update 02

The Remote File That Forced a New Architecture

During early validation, one dataset we needed wasn’t on this machine—it was on another computer in another location. Copying it over would solve the immediate problem, but it reveals the structural reality: in most organizations, data is distributed by default, and “just move it” creates version confusion, delays, and unnecessary exposure.

Rather than optimizing copying, Twiddl leans into the constraint: data stays where it is, queries run where the data lives, and results are coordinated centrally. That reframing turns a blocked test case into the guiding principle.

  • Keep data where it is, instead of centralizing by default
  • Execute locally, close to files and existing permissions
  • Coordinate centrally, routing and aggregating results across locations

For distributed teams—plants, hospitals, regional offices—this is the difference between “send me the export” and “ask once, get a unified answer.” The local-node + central-node model now feels like the simplest workable structure for a distributed reality.

Nov 12, 2025 · Update 03

First Working Prototype: Two Machines, One End-to-End Query

The milestone is intentionally narrow: make one question travel to another machine, execute where the files live, and return results end-to-end. The first version stayed sterile on purpose: two machines, the same local network, similar OS assumptions, and a limited set of file types to close the loop.

What makes this step meaningful is that it’s not just remote execution—it’s coordination. A real distributed setup needs a shared view of what data exists where, whether locations are reachable, and how results come back reliably.

  • A local node watches a directory and infers metadata/schemas using DuckDB
  • The local node maintains a persistent, outbound connection to the central node (simpler networking for most environments)
  • The central node tracks metadata + heartbeats, then dispatches query requests
  • The local node executes subqueries and streams rows back in chunks for fast feedback
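The last step, streaming rows back in chunks for fast feedback, is simple to sketch: the node yields fixed-size batches so the central node can show first rows before the full result exists. The chunk size and generator shape are illustrative assumptions.

```python
# Sketch of chunked result streaming: yield fixed-size batches of rows
# so the consumer sees the first chunk before the query finishes.

def stream_in_chunks(rows, chunk_size=2):
    chunk = []
    for row in rows:
        chunk.append(row)
        if len(chunk) == chunk_size:
            yield chunk
            chunk = []
    if chunk:  # flush the final partial chunk
        yield chunk

chunks = list(stream_in_chunks([1, 2, 3, 4, 5], chunk_size=2))
print(chunks)
```

Because this is a generator, the first chunk is available as soon as it fills, which is what makes the loop feel fast end-to-end.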

Customer meaning: a team can ask a question from one place and get answers from multiple places—without building pipelines first. If this loop stays stable, it becomes the foundation for scaling beyond controlled tests into real multi-site environments.

Dec 21, 2025 · Update 04

The “Cafe Demo” Moment: Remote Central + Local Laptop Node

The prototype is leaving the LAN. The system is now being tested in a setting that doesn’t cooperate: a remote central node, a laptop local node, and public Wi-Fi. This is where operational behavior becomes visible.

Distributed teams don’t live on perfect networks. So trust signals matter: clear online/offline/stale states, safe automatic reconnection, and an experience that doesn’t require babysitting.

  • Connectivity issues need first-class handling, not best-effort retries
  • Node state must be explicit and visible (online/offline/stale)
  • Reconnect must be automatic and safe, without manual intervention
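One way to make node state "explicit and visible" is to derive it from heartbeat age rather than from failed calls. The thresholds below are assumed values for illustration, not Twiddl's actual configuration.

```python
# Hypothetical state rule: classify a node by how long ago it last
# sent a heartbeat, so the UI can show an explicit, honest status.

STALE_AFTER_S = 30      # assumed threshold, not Twiddl's real value
OFFLINE_AFTER_S = 120   # assumed threshold, not Twiddl's real value

def node_state(seconds_since_heartbeat):
    if seconds_since_heartbeat < STALE_AFTER_S:
        return "online"
    if seconds_since_heartbeat < OFFLINE_AFTER_S:
        return "stale"    # recently seen, but data may lag behind
    return "offline"      # trigger automatic, safe reconnection

print([node_state(s) for s in (5, 60, 600)])
```

A rule like this gives reconnection logic a clear trigger and spares users from babysitting the connection.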

This test sharpened the product promise: when conditions are imperfect, the workflow still needs to be simple—files stay local, the central view stays clear, queries run where data lives, and results come back when connectivity allows.

Jan 25, 2026 · Update 05

Launcher, macOS Testing, and a Security Baseline

The focus right now is reducing operational friction and setting a clear security baseline suitable for evaluation and early pilots. The emphasis is reliable startup, cross-environment validation, and a trust model that’s pragmatic for real teams.

  • Launcher and run flow: a simple launcher starts the local node container with a mounted watch directory (low setup overhead)
  • Cross-environment testing (macOS): launcher and runtime assumptions are being validated across Mac setups (common in distributed teams)
  • Security and message validation: nodes prove identity and messages are verified to support safer evaluation

On the data path side, local nodes stream query results back in compressed chunks, and the central node aggregates and paginates results with CSV export. The runtime now feels stable enough to put in front of teams—while keeping a conservative default posture: secure outbound connectivity, explicit node status, and verifiable messaging.
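The data path described above (compressed chunks from nodes; aggregation and pagination centrally) can be sketched with the stdlib. Chunk contents, page size, and function names are illustrative assumptions, not Twiddl's wire format.

```python
import gzip
import json

def compress_chunk(rows):
    """Node side: gzip a chunk of result rows before transmitting."""
    return gzip.compress(json.dumps(rows).encode("utf-8"))

def decompress_chunk(blob):
    """Central side: recover the rows from a compressed chunk."""
    return json.loads(gzip.decompress(blob).decode("utf-8"))

def page(rows, page_number, page_size=2):
    """Central side: serve fixed-size pages of the aggregated result."""
    start = page_number * page_size
    return rows[start:start + page_size]

blob = compress_chunk([["Tanaka Corp", 123456],
                       ["Suzuki Ltd", 89012],
                       ["Smith Partners", 78500]])
rows = decompress_chunk(blob)
print(page(rows, 1))
```

CSV export then becomes a straightforward serialization of the same aggregated rows.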

05 JOIN WAITLIST

Don't forget why you're here. Join the waitlist

We're currently in the Proof of Concept stage and seeking our first partners. Contact us to get early access and start experiencing the value of Twiddl as soon as possible.

[email protected]