Structured Output

NodeRAG relies on structured decoding and requires long, stable structured outputs. Our current test results show that, apart from OpenAI, which produces stable output, Gemini and DeepSeek still struggle to follow the JSON format.

Structured Output Challenges in NodeRAG

NodeRAG relies heavily on breaking down long-form text into structured prompts for further processing. Specifically, it depends on the ability to consistently generate structured JSON outputs from longer text inputs, and the system must maintain proper formatting even in complex, extended tasks. To achieve this, our project initially leaned on OpenAI’s structured output feature (docs here). With it, the model reliably broke down text blocks into heterogeneous nodes, which made downstream processing smooth and efficient.
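To make the idea concrete, here is a minimal sketch of what validating such a decomposition response might look like. The node schema (`id`, `type`, `content` fields under a top-level `nodes` list) is purely illustrative, not NodeRAG’s actual schema:

```python
import json

# Hypothetical schema for a text-to-node decomposition response
# (field names are illustrative, not NodeRAG's actual schema).
REQUIRED_NODE_KEYS = {"id", "type", "content"}

def parse_nodes(raw: str) -> list[dict]:
    """Parse a model reply into a list of node dicts.

    Raises json.JSONDecodeError on malformed JSON and ValueError
    when the JSON is valid but does not match the expected shape.
    """
    data = json.loads(raw)
    nodes = data.get("nodes")
    if not isinstance(nodes, list):
        raise ValueError("response is missing a 'nodes' list")
    for node in nodes:
        missing = REQUIRED_NODE_KEYS - node.keys()
        if missing:
            raise ValueError(f"node missing keys: {missing}")
    return nodes

# Example of a well-formed reply:
raw = '{"nodes": [{"id": "n1", "type": "entity", "content": "NodeRAG"}]}'
nodes = parse_nodes(raw)
print(len(nodes))  # 1
```

With OpenAI’s structured output feature, the `raw` string is guaranteed to conform to the requested schema almost all of the time; the failure statistics below are measured against exactly this kind of validation.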

However, as we looked to expand to other models, we hit some roadblocks. For example, OpenAI’s structured output failures (i.e., responses that didn’t follow the expected JSON format) occurred less than 1% of the time—pretty solid. But when we tried Gemini 2.0 Flash, the failure rate jumped to about 20%. Even worse, with DeepSeek’s official JSON output, nearly 60% of API responses ignored the prescribed format, making error handling virtually impossible.

Because of these issues, in version v0.1.0, we only supported OpenAI and Gemini for structured output tasks.

OpenAI GPT-4o-mini

Figure 1. OpenAI GPT-4o-mini — highly reliable JSON output

Gemini 2.0 Flash

Figure 2. Gemini 2.0 Flash — acceptable format, but lower content quality

That said, even though Gemini managed to stick to the format more often than DeepSeek, we noticed issues with the quality of its outputs—such as repeated phrases, extra spaces, and generally lower fidelity content compared to OpenAI’s GPT-4o-mini.

Our Recommendation

So far, we recommend using OpenAI’s models for NodeRAG, especially for tasks that require consistent structured outputs. We’ll continue evaluating other models for their structured output capabilities to see if they can eventually fit into the NodeRAG pipeline.


Looking Ahead

One future direction we’re excited about: fine-tuning a lightweight model specifically for the NodeRAG text-to-node decomposition step. This task is essentially a blend of summarization and paraphrasing, so it’s relatively straightforward. The challenge lies in maintaining stable, structured outputs over long text inputs.

Once the text has been broken down into structured nodes, we can use more powerful models to extract higher-level insights and viewpoint summaries from the graph.

Last modified April 5, 2025: update reproduce (f23a25c)