Let's continue our journey into LLMs and Gemini! In the previous article, we learned

  • how LLMs generate text, what tokens are, and what configuration parameters like temperature, topP, and so on mean
  • how to create a Google Cloud project, get a Gemini API key, and use the SDK to make text generation requests
  • how to leverage the different models for different tasks

Note: If you haven't read the previous article but are confident in your knowledge of the topics mentioned above, feel free to proceed with this one. Otherwise, I'd suggest reading the previous one first here

This time, let's move forward and build even more sophisticated things, while still requiring only marginally more knowledge than a typical Angular developer already possesses.

Our goals

After being able to actually generate text in response to a prompt, we might be tempted to jump on and try to "build a chatbot". However, as exciting as it is, one of our goals with these articles will be to show how much more LLMs are actually capable of, rather than simply being "engines for chatbots".

To do this, we will, in depth, learn about two important concepts: structured outputs and function calling (also known as tool use). With these tools, we can build applications that can make AI-powered decisions affecting the UI and UX of our applications, and lay the foundation for our future explorations of a powerful frontend + LLM approach known as generative UI.

Note: throughout this and other articles of this series, slightly older models like Gemini 2.0 Flash Lite are used, with the sole goal of reducing costs for readers/learners. For better results, it is recommended to use the latest models available in the Gemini family, while of course taking the cost implications into consideration.

Let's start!

Structured outputs

As always, I believe it is best to start with a task at hand, and then see how the tools we are going to explore fit into solving that task. Let's build a writing assistant that will help writers come up with better wording for certain paragraphs.

First of all, let us build a simple Angular component, which will present the user with two textarea inputs, and a button. The user will input two versions of the same paragraph, and the AI will help them pick the better one. When the user clicks the button, we will send both paragraphs to the Gemini API, and ask it to provide an overview of which one is better and why, then display that overview in a separate box.

@Component({
  selector: 'app-writing-assistant',
  template: `
    <div>
      <h2>Writing Assistant</h2>
      <textarea [(ngModel)]="paragraph1" placeholder="Enter first paragraph"></textarea>
      <textarea [(ngModel)]="paragraph2" placeholder="Enter second paragraph"></textarea>
      <button (click)="overview.reload()">Compare</button>
      @if (overview.hasValue()) {
          <div>
            <h3>Comparison Result:</h3>
            <p>{{ overview.value() }}</p>
          </div>
      }
    </div>
  `,
  imports: [FormsModule],
})
export class WritingAssistantComponent {
  readonly #genAI = inject(GenAIService);
  paragraph1 = signal('');
  paragraph2 = signal('');
  overview = rxResource({
    stream: () => {
      if (!this.paragraph1() || !this.paragraph2()) {
        return of('');
      }
      return this.#genAI.writingOverview(this.paragraph1(), this.paragraph2());
    },
  });
}

As we can see, a new method named writingOverview is being called on the GenAIService. To implement it, we do not actually need a new endpoint on our simple backend, since we already implemented a generic /generate-text endpoint. All we need to do is call the existing generateContent method in the service, providing the two paragraphs and a prompt.

writingOverview(paragraph1: string, paragraph2: string) {
    const prompt = `Provide a brief overview comparing the following two paragraphs:\n\nParagraph 1: ${paragraph1}\n\nParagraph 2: ${paragraph2}. Decide which version is better\n\nOverview:`;
    return this.generateContent(prompt);
}
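
For reference, generateContent is the generic helper we built in the previous article around the /generate-text endpoint. A minimal sketch of it, assuming that endpoint simply accepts a prompt and responds with the generated text as a plain string, might look like this:

generateContent(prompt: string) {
  // calls the generic /generate-text endpoint from the previous article;
  // responseType 'text' because the endpoint is assumed to return a plain string
  return this.#http.post('http://localhost:3000/generate-text', { prompt }, {
    responseType: 'text',
  });
}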

Now, in the two inputs, we can put "Hello, how are you?" and "hlo hwo aer yo" to test a very clear-cut example, and then, in our overview box, we will see something like the following:

**Overview:** Paragraph 1 is a standard, grammatically correct greeting. Paragraph 2 is a heavily abbreviated and misspelled version of the same greeting. **Decision:** Paragraph 1 is significantly better due to its clarity, proper grammar, and readability. Paragraph 2 is difficult to understand and would be considered unprofessional in most contexts.

This is absolutely great! But what if, instead of just showing a response text, we wanted to actually change our UI in accordance with the response? For example, it would be nice from the UX perspective if we could highlight the better paragraph in green and the worse one in red. But how do we achieve that? From the text we got, it is pretty obvious that paragraph 1 is the better choice, but how do we translate that into code? Here's where structured outputs come into play.

It would surely be great if, instead of plain text, Gemini returned a JSON response of, for example, this form:

{
  "betterParagraph": 1,
  "reason": "Paragraph 1 is significantly better due to its clarity, proper grammar, and readability. Paragraph 2 is difficult to understand and would be considered unprofessional in most contexts."
}

But how do we make it work like this? Surely, we can write our prompt in a way that asks for a JSON response. However, we might quickly become quite frustrated, since the model will often start the message with "parasite" words like "Sure, here's your JSON:", or it will simply return a plain text response, or it will return malformed JSON. Fighting this with prompt engineering might yield some results, but it is not worth the time and might not be fully reliable anyway.

Instead, we can use a feature of the Gemini API called structured outputs. It allows us to define a schema for the response we expect, and the model will guarantee that the response will be in the correct format! To do that, we need to do the following:

  • define a new endpoint to call Gemini
  • provide a response type ("application/json" for our use case)
  • provide the schema (what fields we expect, and what types they are)

Here's a simple implementation:

app.post('/writing-assistant', async (req, res) => {
  const { paragraph1, paragraph2 } = req.body;
  const schema = {
    "type": "object",
    "properties": {
      "overview": { "type": "string" },
      "bestChoice": {
        "type": "string",
        "enum": ["1", "2"]
      }
    },
    "required": ["overview", "bestChoice"]
  };
  const prompt = `Provide a brief overview comparing the following two paragraphs:\n\nParagraph 1: ${paragraph1}\n\nParagraph 2: ${paragraph2}. Decide which version is better\n\nOverview:`;
  try {
    const response = await genAI.models.generateContent({
      model: 'gemini-2.0-flash-lite',
      contents: prompt,
      config: {
        responseMimeType: 'application/json',
        temperature: 0.1,
        responseSchema: schema,
      },
    });
    res.json(JSON.parse(response.candidates[0]?.content?.parts?.[0]?.text || '{}'));
  } catch (error) {
    console.error('Error generating writing overview:', error);
    res.status(500).json({ error: 'Failed to generate writing overview' });
  }
});

As we can see, most of it is the same boilerplate as previously, with two slight differences:

  1. We defined a schema object, which describes the structure of the response we expect
  2. In the config object, we provided two new properties: responseMimeType, which we set to "application/json", and responseSchema, which we set to the schema we defined.

The schema object is pretty straightforward. It is a standard JSON schema, which you can learn more about here. In our case, we expect an object with two properties: overview, which is a string, and bestChoice, which is also a string, but can only be one of two values: "1" or "2". Both properties are required.

Now, we can modify our Angular service to call this new endpoint:

writingOverview(paragraph1: string, paragraph2: string) {
    return this.#http.post<GeminiResponse>(
      'http://localhost:3000/writing-assistant',
      { paragraph1, paragraph2 },
    );
}
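
For this endpoint, GeminiResponse is simply the TypeScript counterpart of the schema we defined on the backend; something like:

// mirrors the responseSchema we declared on the backend
export interface GeminiResponse {
  overview: string;
  bestChoice: '1' | '2';
}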

As we can see, on the backend the only difference is that instead of simply returning the extracted text, we parse it as JSON before sending it back, while the Angular service simply points at the new endpoint and types the response. This is, of course, made possible by the structured output we used to instruct Gemini on how to generate the response.

Finally, we can modify our component to use the structured response to change the UI:

@Component({
  selector: 'app-writing-assistant',
  template: `
    <div>
      <h2>Writing Assistant</h2>
       @let value = overview.value();
      <textarea 
        [(ngModel)]="paragraph1" placeholder="Enter first paragraph" 
        [class.better]="value?.bestChoice === '1'" 
        [class.worse]="value?.bestChoice === '2'"></textarea>
      <textarea 
        [(ngModel)]="paragraph2" placeholder="Enter second paragraph" 
        [class.better]="value?.bestChoice === '2'" 
        [class.worse]="value?.bestChoice === '1'"></textarea>
      <button (click)="overview.reload()">Compare</button>
      @if (value?.overview) {
          <div>
            <h3>Comparison Result:</h3>
            <p>{{ value.overview }}</p>
          </div>
      }
    </div>
  `,
  styles: `
    .better {
      border: 2px solid green;
    }
    .worse {
      border: 2px solid red;
    }
  `,
  imports: [FormsModule],
})
export class WritingAssistantComponent {
  /* rest of the component code remains unchanged */
}

Note: if you're curious about how LLMs are capable of generating structured outputs so precisely, watch this YouTube video for a very detailed explanation; however, do not worry if you do not understand all the nuances, since knowing how structured outputs work under the hood is not required to use them effectively

Now we not only display the overview but also highlight the better paragraph in green and the worse one in red! Absolutely amazing, and what an intro to both structured outputs and generative UI!

Of course, this example was quite simplistic. Let's drive the point about the usefulness of structured outputs even further, since we can achieve quite spectacular things with LLMs that produce coherent, machine-readable responses and decisions. Next, let's try to build a dynamic form.

Advanced features with structured outputs

Have you ever created a custom Google form, perhaps to collect some feedback, or maybe to gather information about potential participants of an event? If you have, the next part is going to be familiar, yet exciting. We are going to build an interface which allows the user to add questions, provide a format for answers (for simplicity we will have "input", "textarea", and "dropdown" with options, but this can easily be expanded), and also see a live preview of the form - not simply an image, but an actual form the creator can play around with and edit.

To further make this interesting, we will use Angular signal forms, which are poised to enter the scene in v21, and make it extremely simple to create a dynamic form. Finally, our goal is to simplify the process of creating a form by allowing the user to input a description of what they want the form to be about, and then have Gemini generate the questions for them! So, for instance, they might type something like "I want to create a feedback form for my new product", and Gemini will generate a set of questions that would be appropriate for such a form, and we will display the form as it is directly on the same page.

Note: if you are unfamiliar with Angular signal forms, I suggest you read this article by Manfred Steyer first, or watch me live code with signal forms here.

To achieve this, we will need the following:

  • a data type that describes what a custom form looks like
  • a prompt to generate a form
  • a schema based on the data type that will be used for structured output
  • a linked signal derived from the structured output of Gemini
  • a form created from that linked signal

Let's do it step by step.

First, let's define a simple data type that describes what a custom form looks like. Since we can have multiple fields, it is reasonable to think of every field descriptor as an object, and the entire form as an array of such objects. Here's what our object might look like:

export interface FormField {
  name: string;
  type: 'input' | 'textarea' | 'dropdown';
  options?: {value: string, label: string}[]; // only for dropdown
}
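
Just to visualize what we are aiming for: given a prompt like "I want to create a feedback form for my new product", the generated blueprint might look roughly like this (the actual output will, of course, vary):

[
  { "name": "name", "type": "input" },
  { "name": "feedback", "type": "textarea" },
  {
    "name": "rating",
    "type": "dropdown",
    "options": [
      { "value": "1", "label": "Poor" },
      { "value": "5", "label": "Excellent" }
    ]
  }
]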

Very good, now we need to define a schema for this object. We remember how to do it from the previous example; however, it can be quite cumbersome, and we might inadvertently make some mistakes. So, instead, what we are going to do is head over to Google AI Studio, toggle "Structured Output" on, and click "Edit". In the popup that opens, we can click on "Visual Editor" and start adding our properties and defining their values! We can add properties, define their types, add enum values for strings, and more. Here's what our schema will look like visually:

Schema visual editor

As we can see, all the fields are defined, so what is left is to switch to "Code Editor" and simply copy-paste the generated schema into our backend code:

app.post('/generate-form', async (req, res) => {
  const { prompt: query } = req.body;
  if (!query) {
    return res.status(400).json({ error: 'Prompt is required' });
  }
  const prompt = `User will provide a description of a generic form, and you will generate the blueprint. Include only the fields required to add data, do not include buttons. ${query}`;
  const schema = {
  "type": "object",
  "properties": {
    "form": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "name": {
            "type": "string"
          },
          "type": {
            "type": "string",
            "enum": [
              "input",
              "textarea",
              "dropdown"
            ]
          },
          "options": {
            "type": "array",
            "items": {
              "type": "object",
              "properties": {
                "value": {
                  "type": "string"
                },
                "label": {
                  "type": "string"
                }
              },
              "propertyOrdering": [
                "value",
                "label"
              ],
              "required": [
                "value",
                "label"
              ]
            }
          }
        },
        "propertyOrdering": [
          "name",
          "type",
          "options"
        ],
        "required": [
          "name",
          "type"
        ]
      }
    }
  },
  "propertyOrdering": [
    "form"
  ],
  "required": [
    "form"
  ]
};
  try {
    const response = await genAI.models.generateContent({
      model: 'gemini-2.0-flash-lite',
      contents: [
        { role: 'user', parts: [{ text: prompt }] }
      ],
      config: {
        responseMimeType: 'application/json',
        temperature: 0.1,
        responseSchema: schema, 
      }
    });
    res.json(JSON.parse(response.text || '{"form": []}'));
  } catch (error) {
    console.error('Error generating form blueprint:', error);
    res.status(500).json({ error: 'Failed to generate form blueprint' });
  }
});

While this is a bit of a long function, the only two differences from the previous example are the prompt and the schema (plus parsing the JSON text before sending it back). The rest is the same boilerplate, which is great news, since it means we can now build a component using this endpoint.
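
The generateForm method on the GenAIService is not shown above, but it follows the same pattern as writingOverview. A minimal sketch, assuming the backend returns the parsed { form: FormField[] } object, might be:

generateForm(prompt: string) {
  // the backend wraps the fields in a "form" property, so we unwrap it here
  // (map comes from 'rxjs')
  return this.#http
    .post<{ form: FormField[] }>('http://localhost:3000/generate-form', { prompt })
    .pipe(map(response => response.form));
}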

@Component({
    selector: 'app-generative-form',
    template: `
        <h2>Generative Form Component</h2>
        <textarea 
            #promptArea (keyup.enter)="prompt.set(promptArea.value)">
        </textarea>
        <button (click)="prompt.set(promptArea.value)">Generate</button>
        @if (formBlueprint.hasValue()) {
            <h3>Generated Form Blueprint:</h3>
            @for (field of formBlueprint.value(); track field.name) {
                <div>
                    <label [for]="field.name">{{field.name | titlecase }}</label>
                    @switch (field.type) {
                        @case ('input') {
                            <input [id]="field.name" [control]="$any(form)[field.name]" />
                        }
                        @case ('textarea') {
                            <textarea [id]="field.name" [control]="$any(form)[field.name]"></textarea>
                        }
                        @case('dropdown') {
                            <select [id]="field.name" [control]="$any(form)[field.name]">
                                @for(option of field.options; track option.value) {
                                    <option [value]="option.value">{{ option.label }}</option>
                                }
                            </select>
                        }
                        @default {
                            <div>Unknown field type: {{field.type}}</div>
                        }
                    }
                </div>
            }
            {{ formValue() | json }}
        }
    `,
    imports: [Control, TitleCasePipe, JsonPipe],
})
export class GenerativeFormComponent {
    readonly #genAI = inject(GenAIService);
    prompt = signal('');
    formBlueprint = rxResource({
        params: () => ({prompt: this.prompt()}),
        stream: ({params}) => {
            if (!params.prompt) {
                return of(null);
            }
            return this.#genAI.generateForm(params.prompt);
        },
    });

    formValue = linkedSignal(() => {
        const blueprint = this.formBlueprint.value();
        if (!blueprint) {
            return null;
        }
        const value = {} as Record<string, any>;
        for (const field of blueprint) {
            value[field.name] = '';
        }
        return value;
    });

    form = form(this.formValue);
}

Now, let's carefully examine what we have done here:

  1. Create a resource that calls the generateForm method of our service and stores the form blueprint as the array of field descriptors we defined earlier
  2. Create a linked signal that derives its value from the form blueprint, creating an object with keys being the names of the fields and values being empty strings. This will be the base of the signal form (see the example after this list)
  3. Create a signal form from the linked signal using the form function; yes, this is that simple with signal forms!
  4. In the template, iterate over form fields and define appropriate controls based on the type of the field using @switch/@case blocks
  5. To be able to dynamically read form controls and bind them to inputs with the [control] directive, we need to use $any, since the form structure is not known at compile time (obviously, it is generated with an LLM)
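
For example, if the model returns a blueprint with two fields named email and comments, the linked signal produces the following initial value, which the form function then turns into a signal form:

// blueprint: [{ name: 'email', type: 'input' }, { name: 'comments', type: 'textarea' }]
// formValue():
{
  "email": "",
  "comments": ""
}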

And this is it! Now, the end user can simply type the description of the form they want to create, live preview it, play around, edit, iterate, and get to their final result! Let's now move to the final topic of this article (which is actually an opening into a whole new world) - function calling.

Function calling

Funnily enough, function or tool calling, while a fundamental pillar of building AI-powered applications and agentic workflows, is a feature that can essentially be thought of as a subset of structured outputs. Let's understand what it is in general and how it (slightly) differs from a structured output.

  • Structured outputs allow us to define a schema for the response we expect from the model, and the model will generate a response that adheres to that schema
  • Function calling lets us tell the LLM what functions we have available (like methods in frontend or backend), and the model will decide which one (API, database, search tool) it needs to call
  • We could theoretically achieve a similar result using structured outputs; however, with function calling, we can get both output (even structured) and function calls in the same response, thus being able to not only call functions, but also show explainer messages and prompts from the model

So, let's start building an app command line, where the user can type prompts that execute commands in our app, like navigating to a certain page, changing settings, and more. The very first thing we will implement is the navigation command.

To achieve this, we need to define a schema that describes our function and its parameters. Here's what it might look like:

[
  {
    "name": "navigate",
    "description": "Navigates the user to the page defined by the URL",
    "parameters": {
      "type": "object",
      "properties": {
        "url": {
          "type": "string",
          "enum": [
            "/some-url",
            "/another-url",
            "/yet-another-url"
          ]
        }
      },
      "required": [
        "url"
      ],
      "propertyOrdering": [
        "url"
      ]
    }
  }
]

As we can see, it is almost identical to what we provided for structured output, with the only difference being that we have name and description properties at the top level, which describe the function we are defining. These are of paramount importance, since the LLM will use them to decide which function(s) to call out of the many provided.

However, this setup is not very useful: we provided some mock URLs, while the URLs our Angular app actually has are quite different, and also dynamic in the sense that, in the future, more routes might become available or old ones might become obsolete.

To counter this, the best thing we can do is generate the schema dynamically, based on the actual routes our Angular app has. To do that, we can use the Router service and extract the routes from it. Here's how we can do it:

getRoutesSchema() {
    const routes = this.#router.config
      .filter(route => route.path) // filter out routes without a path
      .map(route => `/${route.path}`); // prepend '/' to each path
    return {
      name: 'navigate',
      description: 'Navigates the user to the page defined by the URL',
      parameters: {
        type: 'object',
        properties: {
          url: {
            type: 'string',
            enum: routes,
          },
        },
        required: ['url'],
        propertyOrdering: ['url'],
      },
    };
}

Before we make this work, let's again think about what is going on here. We are creating an object that describes a function that the LLM may or may not (this is important!) choose to invoke. When we say "invoke", in this context we do not mean it will actually invoke it, but rather tell us (our app) to do so (we are still able to opt out of actually doing it). The function is navigate, and it takes a single parameter, url, which is a string and can be one of the routes we extracted from the Angular Router. So, simply put, the LLM notifies the caller to execute navigate(url) to navigate to a page.
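
Concretely, when the model decides our function should be invoked, the relevant part of its response will contain a functionCall object roughly of this shape (the URL value here is just an example):

{
  "functionCall": {
    "name": "navigate",
    "args": {
      "url": "/writing-assistant"
    }
  }
}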

Now, we need to add a simple endpoint that will just pick up the user's prompt and the schema we dynamically obtained and call Gemini:

app.post('/command-line', async (req, res) => {
  const { commands, prompt } = req.body;

  if (!commands) {
    return res.status(400).json({ error: 'Commands are required' });
  }

  const finalPrompt = `
    You are an assistant who helps users execute commands in a web app, described in natural language. Take a look at the available tools and the user's prompt and decide which ones to call and with what arguments.

    User prompt: ${prompt}
  `;

  const response = await genAI.models.generateContent({
    model: 'gemini-2.0-flash-lite',
    contents: finalPrompt,
    config: {
      temperature: 0.1,
      tools: [{functionDeclarations: commands}],
    },
  });

  res.json(response);
});

As we can see, this time it's even simpler, as the bulk of the work is done on the frontend and provided to Gemini as a toolset it can call. Here we also slightly augment the user's prompt to make Gemini focus on choosing the right tool, since this is a command tool rather than a chatbot.
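
The callCommandLine method used below is just another thin wrapper in the GenAIService. A minimal sketch, with the response typed loosely for brevity, could look like this:

callCommandLine(commands: object[], prompt: string) {
  // the backend forwards the whole Gemini response, so we keep the type loose here;
  // in a real app you might type it properly (e.g. with the SDK's response types)
  return this.#http.post<any>('http://localhost:3000/command-line', { commands, prompt });
}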

Finally, we can implement the component that will provide the user interface for our command line:

@Component({
    selector: 'app-command-line',
    template: `
        <input (keyup.enter)="onEnter($event)" />
    `,               
})
export class CommandLineComponent {
    readonly #router = inject(Router);
    readonly #genAI = inject(GenAIService);

    onEnter(event: Event) {
        const input = (event.target as HTMLInputElement).value;
        const commands = [this.getRoutesSchema()];
        // callCommandLine simply takes the user's input and the commands schema and calls the /command-line endpoint we defined above 
        this.#genAI.callCommandLine(commands, input).subscribe(response => {
            const command = response.candidates[0]?.content.parts[0].functionCall;
            this.handleCommand(command);
        });
    }

    handleCommand(command?: {name: string; args: Record<string, any>}) {
        if (!command) {
            // the model may decide that no tool call is needed at all
            return;
        }
        switch (command.name) {
            case 'navigate': {
                const url = command.args['url'];
                // we might want to validate the URL here before navigating
                // since LLMs can sometimes hallucinate, in a production setting
                // it would be a good idea to check if the URL actually exists
                // within our app routes
                this.#router.navigateByUrl(url);
                break;
            }
            default:
                console.warn(`Unknown command: ${command.name}`);
        }
    }

    getRoutesSchema() {
        // omitted for the sake of brevity, see above
    }
}

As we can see, the response.candidates[0]?.content.parts[0] item now contains a functionCall property, which is the function Gemini decided needs to be called and passed back to us. We then handle that result in a simple switch/case block, and call the appropriate Angular method - in our case, Router.navigateByUrl.

If we now open the component in a browser and type something like "navigate to writing assistant", the app will navigate to the writing assistant component we built previously! Absolutely amazing. Of course, for now we only have one command, but here is another that you can now implement on your own: a command that changes the theme of the app (light/dark). You can define a simple service that holds the current theme, and then define a function schema for changing the theme, and implement the command in the handleCommand method.
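
As a starting point for that exercise, the function declaration for the theme command might look something like this (the name and parameter are just suggestions):

{
  name: 'setTheme',
  description: 'Switches the application color theme between light and dark mode',
  parameters: {
    type: 'object',
    properties: {
      theme: {
        type: 'string',
        enum: ['light', 'dark'],
      },
    },
    required: ['theme'],
  },
}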

Conclusion

We built a lot on top of what we already had in the previous article. We learned:

  • structured outputs, which let us force the model to generate responses in a certain format, and how to use them to build generative UIs
  • function calling, which allows us to define functions (or tools) that the model can choose to call
  • a bit of prompt engineering, augmenting the user's prompt to make it work better for our use case
  • the beginnings of agentic workflows, where the model can make decisions and call functions based on the user's input to come up with far more complex results than simple text generation

In the next article, we will go super deep and discuss embeddings - special numerical representations of text that allow us to do some absolutely spectacular things, like semantic search, text classification, and more. Stay tuned!

Small Promotion

My book, Modern Angular, is now in print! I spent a lot of time writing about every single new Angular feature from v12-v18, including enhanced dependency injection, RxJS interop, Signals, SSR, Zoneless, and way more.

If you work with a legacy project, I believe my book will be useful to you in catching up with everything new and exciting that our favorite framework has to offer. Check it out here: https://www.manning.com/books/modern-angular

P.S. There is one chapter in my book that helps you work with LLMs in the context of Angular apps; that chapter is already kind of outdated, despite the book being published just earlier this year (see how insanely fast-paced the AI landscape is?!). I hope you can forgive me ;)

