GraphQL: Understanding, Implementing, and Best Practices

Web Chapter

Engineering | Jul 10, 2024 | 20 min read

A step-by-step guide to GraphQL best practices.

GraphQL offers a more flexible and efficient approach compared to traditional RESTful services. However, to fully leverage its capabilities, it's crucial to follow established best practices and understand potential pitfalls. This includes ensuring consistent naming conventions, designing flexible and scalable schemas, optimizing queries, handling errors gracefully, and implementing effective caching and versioning strategies.

At its core, GraphQL is a query language for your API, allowing clients to request exactly the data they need and nothing more. This selective data retrieval reduces over-fetching and under-fetching issues common in REST APIs.

While Apollo GraphQL’s official documentation at https://www.apollographql.com/docs/ explains how to set up both Apollo Client and Apollo Server to send and handle requests, this blog post will explore setting up Apollo with React, some best practices for state management and caching, the benefits of using fragments, and some common pitfalls to avoid.

How GraphQL Works

A GraphQL server exposes a single endpoint and utilizes a schema to define the types of data that can be queried or mutated. Clients send queries or mutations to this endpoint, and the server resolves these requests based on the schema, often aggregating data from multiple sources.

Schema Definition: The schema is the backbone of a GraphQL API, defining the types, queries, and mutations available. It serves as a contract between the client and the server.
Resolvers: Resolvers are functions that handle the logic for fetching the data defined in the schema. Each field in a GraphQL query corresponds to a resolver function.
Query Execution: When a query is executed, the GraphQL engine parses it, validates it against the schema, and calls the appropriate resolvers to fetch the requested data.
Performance Optimization: GraphQL allows for efficient data fetching by enabling clients to request multiple resources in a single query, reducing network overhead and improving application performance.
Schema Design: Carefully design your GraphQL schema to ensure it's flexible, scalable, and aligns with your application's needs. A well-structured schema improves query efficiency and maintainability by using clear and consistent naming, designing around client use cases, avoiding implementation details, and providing thorough documentation. It also involves planning for future changes and grouping related fields to enhance usability and avoid breaking changes.
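To make the relationship between a query and its resolvers concrete, here is a minimal hand-rolled sketch. It is not how a real GraphQL engine is implemented (parsing and validation are skipped entirely), and the data, the resolver map, and the `execute` helper are all hypothetical names introduced for illustration.

```typescript
// Hand-rolled illustration only: a real engine parses and validates the query first.
type Resolver = (parent: any, args: Record<string, unknown>) => unknown;

// Hypothetical in-memory data standing in for two separate data sources.
const authors = [{ id: "a1", name: "Ada" }];
const books = [{ id: "b1", title: "Notes", authorId: "a1" }];

// One resolver per field that needs custom fetching logic, grouped by type name.
const resolvers: Record<string, Record<string, Resolver>> = {
  Query: {
    book: (_parent, args) => books.find((b) => b.id === args.id),
  },
  Book: {
    author: (book) => authors.find((a) => a.id === book.authorId),
  },
};

// A pre-parsed selection: field name, arguments, and nested selections.
interface Selection {
  field: string;
  args?: Record<string, unknown>;
  returnType?: string; // name of the object type the field returns, if any
  children?: Selection[];
}

function execute(typeName: string, parent: any, selections: Selection[]): Record<string, unknown> {
  const result: Record<string, unknown> = {};
  for (const sel of selections) {
    const resolver = resolvers[typeName]?.[sel.field];
    // Without a resolver, fall back to reading the property from the parent.
    const value = resolver ? resolver(parent, sel.args ?? {}) : parent?.[sel.field];
    result[sel.field] =
      sel.children && sel.returnType && value != null
        ? execute(sel.returnType, value, sel.children)
        : value;
  }
  return result;
}

// Roughly equivalent to: { book(id: "b1") { title author { name } } }
const response = execute("Query", {}, [
  {
    field: "book",
    args: { id: "b1" },
    returnType: "Book",
    children: [
      { field: "title" },
      { field: "author", returnType: "Author", children: [{ field: "name" }] },
    ],
  },
]);
```

The result contains only the requested fields, and the `author` resolver pulls from a second source, which is the aggregation behaviour described above.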

What are the key principles of schema design in GraphQL?

Designing a GraphQL schema with flexibility and scalability in mind involves several key principles that ensure the schema can grow and adapt to changing requirements while maintaining performance and usability.

Unified Schema: A GraphQL schema defines a collection of types and their relationships in a unified manner. This allows client developers to request specific subsets of data with optimized queries, enhancing flexibility.
Implementation-Agnostic: The schema is not responsible for defining data sources or storage methods, making it adaptable to various backend implementations without requiring changes to the schema itself.
Field Nullability: By default, fields can return null, but non-nullable fields can be specified using an exclamation mark (!). This provides control over data integrity and error handling, contributing to robust and scalable schema design.
Query-Driven Design: The schema should be designed based on client needs rather than backend data structures. This approach ensures that the schema evolves with client requirements, supporting flexibility and scalability.
Version Control and Change Management: Maintaining the schema in version control allows tracking of changes over time. Most additive changes are backward compatible, but careful management of breaking changes is essential for scalability.
Use of Descriptions: Incorporating Markdown-enabled documentation strings (descriptions) in the schema helps developers understand and use the schema effectively, promoting a flexible development environment.
Naming Conventions: Consistent naming conventions, such as camelCase for fields and PascalCase for types, ensure clarity and ease of use across different client implementations, aiding in scalability.
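As a rough sketch of how several of these principles look together, the schema below combines descriptions, nullability, and the naming conventions. The `Product` and `Review` types and all their fields are hypothetical; they are not taken from any real API.

```typescript
// Hypothetical schema; type and field names are illustrative only.
const typeDefs = /* GraphQL */ `
  """
  A product that can be listed in the **catalog**.
  """
  type Product {
    "Stable unique identifier; never null."
    id: ID!
    name: String!
    "Nullable: a product may not have a description yet."
    description: String
    "The list and its items are both non-null; an empty list is still valid."
    reviews: [Review!]!
  }

  type Review {
    id: ID!
    rating: Int!
  }

  type Query {
    "Shaped around what a client screen needs, not around database tables."
    products(first: Int): [Product!]!
    "Returns null when no product matches the id."
    product(id: ID!): Product
  }
`;
```

Nothing in this schema says where products are stored, so the backend can change without touching the contract.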

Apollo GraphQL Setup

Apollo Client is a powerful tool for managing GraphQL operations on the client-side. It provides a robust set of features for state management, caching, and error handling, which are crucial for building scalable and efficient GraphQL applications.

Effective state management and caching are critical for building performant GraphQL applications, and Apollo Client excels in these areas with several built-in features:

Normalized Caching: Apollo Client normalizes the data it receives from the server, storing it in a flat structure. This allows for efficient data retrieval and minimizes redundancy.
Local State Management: Apollo Client can manage both remote and local state, allowing you to leverage GraphQL for application-wide state management.

// src/apollo-client.ts
import { ApolloClient, InMemoryCache } from '@apollo/client';

const client = new ApolloClient({
  uri: 'https://your-graphql-endpoint.com/graphql',
  cache: new InMemoryCache(),
});

export default client;

Create Apollo Client

// src/App.tsx
import React from 'react';
import { ApolloProvider } from '@apollo/client';
import client from './apollo-client';
import YourComponent from './YourComponent';

const App: React.FC = () => (
  <ApolloProvider client={client}>
    <YourComponent />
  </ApolloProvider>
);

export default App;

Using Apollo Client in Components

// src/YourComponent.tsx
import React from 'react';
import { useQuery, gql } from '@apollo/client';

const GET_DATA = gql`
  query GetData {
    data {
      id
      name
    }
  }
`;

interface Data {
  data: {
    id: string;
    name: string;
  }[];
}

const YourComponent: React.FC = () => {
  const { loading, error, data } = useQuery<Data>(GET_DATA);

  if (loading) return <p>Loading...</p>;
  if (error) return <p>Error: {error.message}</p>;

  return (
    <div>
      {data?.data.map(item => (
        <div key={item.id}>{item.name}</div>
      ))}
    </div>
  );
};

export default YourComponent;

Query Example with TypeScript

Apollo Client seamlessly integrates with React's component model and hooks, making it a natural fit for managing server state in React applications.

When setting up Apollo Client, consider implementing automatic persisted queries (APQ) to optimize network usage and improve security. APQ can significantly reduce bandwidth consumption and enhance overall performance, especially for large applications. By sending unique identifiers (SHA-256 hashes) of queries instead of the full query strings, APQs reduce request sizes and latency. Additionally, APQs can be integrated with Content Delivery Networks (CDNs) and caching mechanisms to further improve performance.
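The core idea can be sketched without any GraphQL library: hash the query text with SHA-256 and send the hash in the request's `extensions` field. The snippet below only builds the request bodies and assumes a Node.js runtime for `node:crypto`; in a real application this exchange is normally handled by the client library rather than written by hand.

```typescript
import { createHash } from "node:crypto";

const query = "query GetData { data { id name } }";

// APQ identifies a query by the SHA-256 hash of its text.
const sha256Hash = createHash("sha256").update(query).digest("hex");

// First attempt: send only the hash, not the query text.
const hashOnlyRequest = {
  operationName: "GetData",
  variables: {},
  extensions: { persistedQuery: { version: 1, sha256Hash } },
};

// If the server does not recognise the hash, the client retries with the
// full query text plus the same hash so the server can store the pair.
const retryRequest = { ...hashOnlyRequest, query };
```

After the first successful registration, every later request for the same operation can use the short hash-only form.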

How can I optimize GraphQL queries for better performance?

GraphQL query performance can be significantly optimized through the use of batching and caching techniques. These methods help reduce server load, improve response times, and mitigate common issues such as the N+1 query problem.

Batching

Combining Multiple Queries: Batching involves merging multiple queries into a single request. This reduces the number of network round trips required, which is particularly beneficial when retrieving related data for multiple records.
Implementing Batched Data Fetching: This technique combines multiple operations into a single request, minimizing the overhead associated with each request. It helps address the N+1 query problem by reducing the number of database round trips.
Using DataLoader: DataLoader is a utility library that implements batched data fetching and caching. It efficiently loads data from a database and reduces redundant data fetching, improving query performance.
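The batching idea behind DataLoader can be illustrated with a small hand-rolled loader. This is not DataLoader's implementation or API; `TinyLoader` and the fake `user-<id>` lookup are hypothetical names used only to show how calls made in the same tick collapse into one batched fetch.

```typescript
type BatchFn<K, V> = (keys: K[]) => Promise<V[]>;

interface Pending<K, V> {
  key: K;
  resolve: (value: V) => void;
  reject: (reason: unknown) => void;
}

class TinyLoader<K, V> {
  private batchFn: BatchFn<K, V>;
  private queue: Pending<K, V>[] = [];
  private cache = new Map<K, Promise<V>>();

  constructor(batchFn: BatchFn<K, V>) {
    this.batchFn = batchFn;
  }

  load(key: K): Promise<V> {
    // Repeated keys reuse the promise that was already created for them.
    const cached = this.cache.get(key);
    if (cached) return cached;

    const promise = new Promise<V>((resolve, reject) => {
      this.queue.push({ key, resolve, reject });
      // The first key in an empty queue schedules a single flush for later.
      if (this.queue.length === 1) {
        Promise.resolve().then(() => this.flush());
      }
    });
    this.cache.set(key, promise);
    return promise;
  }

  private flush(): void {
    const batch = this.queue;
    this.queue = [];
    // One call to the batch function serves every queued key.
    this.batchFn(batch.map((item) => item.key))
      .then((values) => batch.forEach((item, i) => item.resolve(values[i])))
      .catch((err) => batch.forEach((item) => item.reject(err)));
  }
}

// Usage: three loads, but the (fake) data source is queried once.
let batchCalls = 0;
const userLoader = new TinyLoader<number, string>(async (ids) => {
  batchCalls += 1;
  return ids.map((id) => `user-${id}`);
});

const loaded = Promise.all([userLoader.load(1), userLoader.load(2), userLoader.load(1)]);
```

In a resolver, each nested field would call `load` for its own key, and the loader turns N separate lookups into one batched query.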

Caching

Query Caching: This process involves storing the results of a GraphQL query for a certain period and returning the cached results for subsequent identical requests. It reduces response times and minimizes server workload by avoiding repeated execution of the same query.
Reducing Server Requests: By caching frequently accessed data on the client or server side, the number of requests made to the server is reduced, leading to faster response times and decreased server load.
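A minimal sketch of time-limited query caching is shown below. The `QueryCache` class, its key format, and the 60-second lifetime are assumptions made for illustration; production caches usually also need invalidation when the underlying data changes.

```typescript
interface CacheEntry<T> {
  value: T;
  expiresAt: number;
}

class QueryCache<T> {
  private entries = new Map<string, CacheEntry<T>>();
  private ttlMs: number;

  constructor(ttlMs: number) {
    this.ttlMs = ttlMs;
  }

  // Identical query text and variables produce the same key.
  // Note: variables with a different property order produce a different key.
  static key(query: string, variables: Record<string, unknown> = {}): string {
    return JSON.stringify([query, variables]);
  }

  get(key: string, now: number = Date.now()): T | undefined {
    const entry = this.entries.get(key);
    if (!entry || entry.expiresAt <= now) {
      this.entries.delete(key); // drop stale entries lazily
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: T, now: number = Date.now()): void {
    this.entries.set(key, { value, expiresAt: now + this.ttlMs });
  }
}

// Usage with explicit timestamps so the behaviour is easy to follow.
const resultCache = new QueryCache<{ products: string[] }>(60_000);
const cacheKey = QueryCache.key("{ products }", {});
resultCache.set(cacheKey, { products: ["tea"] }, 0);
const hit = resultCache.get(cacheKey, 30_000); // still fresh
const miss = resultCache.get(cacheKey, 60_000); // expired
```

A server would check the cache before executing the query and store the result afterwards, so identical requests within the lifetime skip execution entirely.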

Best Practices for GraphQL Implementation

In this section, we will go through using query parameters to filter the results of our queries, using fragments to organize queries and mutations, and leveraging the cache to lazy-load nested fields for better performance and shorter loading times (better UX).

Design Your Schema Effectively

Use clear and consistent naming conventions, such as camelCase for fields and PascalCase for types.
Keep the schema simple and flexible, avoiding overly complicated data models. Utilize interfaces and unions to represent shared features and mix different types.

What are examples of well designed GraphQL schemas?

The following are examples of well-designed GraphQL schemas that adhere to best practices for naming conventions and future-proofing:

Schema Naming Conventions

Consistency: Ensure that naming conventions are consistent across the entire schema.
Specificity: Use specific names to avoid broad applicability and potential conflicts.
Avoid Acronyms: Refrain from using acronyms, initialisms, and abbreviations.

Casing

camelCase: Use for field names, argument names, and directive names. Example:

type Query { myCamelCaseFieldNames(myArgumentName: String): String }

PascalCase: Use for type names. Example:

type MyType { ... } enum MyEnum { ... }

SCREAMING_SNAKE_CASE: Use for enum values. Example:

enum MyEnum { VALUE_ONE VALUE_TWO }

Field Names

Query Fields: Avoid verb prefixes like get or list. Example:

type Query { products: [Product] }

Mutation Fields: Start with a verb. Example:

type Mutation { addCustomer(input: AddCustomerInput): AddCustomerPayload! }

Type Names

Input Types: Use the suffix Input. Example:

input AddCustomerInput { name: String! }

Output Types: Use a consistent suffix like Response or Payload. Example:

type AddCustomerPayload { success: Boolean! }

Additional Considerations

Namespacing: Use PascalCase or Single_Underscore prefix to resolve naming conflicts between different domains. Examples:

type StoreCustomer { ... } type SiteCustomer { ... }

type Store_Customer { ... } type Site_Customer { ... }

Optimize Your Queries

Avoid over-fetching by ensuring queries are focused and only request the necessary data.
Implement strategies such as batching, caching, and pagination to improve performance.

How can optimizing GraphQL queries improve API performance?

Optimizing GraphQL queries can significantly enhance API performance by addressing several key areas:

Client-Side Caching: Implementing caching strategies on the client side can improve performance and ensure a consistent user interface.
GET Requests for Queries: Utilizing GET requests for query operations can leverage HTTP caching and CDN usage. This is particularly effective when combined with hashed query documents to reduce data transmission size.
Addressing the N+1 Problem: GraphQL's design can lead to multiple requests to data sources (N+1 problem). This can be mitigated by batching requests using tools like Facebook's DataLoader, which optimizes data fetching by combining multiple requests into a single one.
Demand Control: Implementing mechanisms such as pagination, limiting operation depth and breadth, and analyzing query complexity can prevent excessive load on server resources from complex operations.
Response Compression: Using GZIP to compress JSON-formatted responses can reduce the size of data transmitted over the network, enhancing performance.
Performance Monitoring: Utilizing tools like OpenTelemetry for monitoring can help identify bottlenecks and optimize API performance by providing insights into request execution metrics, logs, and traces.

By focusing on these optimization strategies, GraphQL APIs can achieve improved performance, reduced latency, and better resource management.
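Of these strategies, depth limiting is the easiest to sketch. The function below is a crude approximation that counts brace nesting in the raw query text; it would miscount braces inside string or input-object arguments, so a real server should measure depth on the parsed query instead. `MAX_DEPTH` is a hypothetical limit.

```typescript
// Crude approximation: measures selection depth by counting brace nesting.
function queryDepth(query: string): number {
  let depth = 0;
  let max = 0;
  for (const ch of query) {
    if (ch === "{") {
      depth += 1;
      max = Math.max(max, depth);
    } else if (ch === "}") {
      depth -= 1;
    }
  }
  return max;
}

const MAX_DEPTH = 3; // hypothetical limit, tune per API

// Rejects a query before execution when it nests deeper than the limit.
function assertWithinDepth(query: string): void {
  const depth = queryDepth(query);
  if (depth > MAX_DEPTH) {
    throw new Error(`Query depth ${depth} exceeds limit ${MAX_DEPTH}`);
  }
}

const shallowQuery = "{ user { subscriptions { title } } }";
const deepQuery = "{ user { friends { friends { friends { name } } } } }";
```

Calling `assertWithinDepth(shallowQuery)` passes, while `assertWithinDepth(deepQuery)` throws because the query nests five levels deep.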

Handle Errors Gracefully

Provide clear error messages and follow best practices for error handling.
Allow fields and arguments to be optional to avoid errors and ensure null-safety.
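Because a GraphQL response can carry both `data` and `errors` at the same time, client code benefits from handling the two together. The sketch below assumes the standard response shape; the `unwrap` helper and the sample payload are hypothetical.

```typescript
interface GraphQLErrorShape {
  message: string;
  path?: (string | number)[];
  extensions?: Record<string, unknown>;
}

interface GraphQLResponse<T> {
  data?: T | null;
  errors?: GraphQLErrorShape[];
}

// Returns usable data plus readable warnings; throws only when no data came back.
function unwrap<T>(response: GraphQLResponse<T>): { data: T; warnings: string[] } {
  const warnings = (response.errors ?? []).map(
    (e) => `${e.path?.join(".") ?? "request"}: ${e.message}`,
  );
  if (response.data == null) {
    throw new Error(warnings.join("; ") || "Empty GraphQL response");
  }
  return { data: response.data, warnings };
}

// A partial result: the nullable avatarUrl field failed, the rest succeeded.
const partial = unwrap({
  data: { user: { name: "Ada", avatarUrl: null } },
  errors: [{ message: "Avatar service unavailable", path: ["user", "avatarUrl"] }],
});
```

Since `avatarUrl` is nullable, the failure stays local to that field and the rest of the page can still render.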

Secure Your API

Implement authentication, authorization, and input validation to protect the API.
Avoid hard-coded parameters; use variables instead to protect sensitive information.

Client-Side Considerations

Write focused queries and mutations, implement caching, and handle errors on the client side.
Use pagination for long lists and navigate through pages with cursor-based navigation.
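Cursor-based navigation can be sketched with an in-memory list. Everything here is hypothetical: the `fetchPage` function stands in for a paginated query, and item ids double as cursors for simplicity, whereas real APIs usually return opaque cursor strings.

```typescript
interface Page<T> {
  items: T[];
  endCursor: string | null;
  hasNextPage: boolean;
}

// Hypothetical dataset standing in for a server-side collection.
const allItems = ["a", "b", "c", "d", "e"].map((id) => ({ id }));

// Returns up to `first` items that come after the item identified by `after`.
function fetchPage(first: number, after?: string | null): Page<{ id: string }> {
  const start = after ? allItems.findIndex((item) => item.id === after) + 1 : 0;
  const items = allItems.slice(start, start + first);
  return {
    items,
    endCursor: items.length > 0 ? items[items.length - 1].id : null,
    hasNextPage: start + first < allItems.length,
  };
}

// Walk every page by feeding each endCursor back in as `after`.
const seen: string[] = [];
let cursor: string | null = null;
do {
  const page: Page<{ id: string }> = fetchPage(2, cursor);
  seen.push(...page.items.map((item) => item.id));
  cursor = page.hasNextPage ? page.endCursor : null;
} while (cursor !== null);
```

Unlike offset-based paging, the cursor points at a specific item, so new rows inserted earlier in the list do not shift the next page.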

Versioning and Evolution

Deprecate old fields and introduce new fields alongside them; clients can use aliases to keep their existing response keys while they migrate.
Maintain a change log and ensure changes do not break existing functionality, allowing specific clients to opt into breaking changes.

What is the importance of versioning in the evolution of GraphQL APIs?

Versioning in GraphQL APIs is a topic of debate, with many experts advocating for a version-less approach. Here are the key insights on why versioning is important and how it is approached in the evolution of GraphQL APIs:

Continuous Evolution: GraphQL APIs often adopt a continuous evolution approach, which involves maintaining a single version and evolving it over time. This method emphasizes backward compatibility and minimizes disruption for clients.
Commitment to Contracts: A strong commitment to maintaining API contracts is crucial. This involves making additive changes that are backward compatible, reducing the need for breaking changes.
Deprecation and Change Management: When breaking changes are necessary, GraphQL APIs rely on deprecation notices and a structured change management process to transition clients smoothly.
Advantages of GraphQL: GraphQL supports continuous evolution through features like first-class deprecation support and detailed usage tracking, which facilitate smoother transitions and client adaptation.
Versioning as a Last Resort: While versioning can provide a sense of security, it often leads to increased complexity and maintenance challenges. In GraphQL, versioning is considered a last resort, with continuous evolution preferred for its flexibility and reduced client impact.

Overall, the preference for continuous evolution over versioning in GraphQL APIs is due to its ability to maintain stability and reduce the burden on clients, while still allowing for necessary changes through careful management and communication.
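As a small sketch of continuous evolution, the schema below adds a replacement field and marks the old one with the built-in `@deprecated` directive instead of publishing a new API version. The `Customer` type and its fields are hypothetical.

```typescript
// Hypothetical schema evolution: `name` is superseded by `fullName`.
const evolvedTypeDefs = /* GraphQL */ `
  type Customer {
    id: ID!
    "Added later; existing clients are unaffected by a new field."
    fullName: String!
    "Kept so existing queries continue to work during the migration."
    name: String! @deprecated(reason: "Use fullName instead.")
  }
`;
```

Once usage tracking shows that no client still selects `name`, the field can be removed with far less risk than a versioned cutover.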

Implementing these best practices not only improves code organization but also enhances the overall efficiency of your GraphQL implementation. By optimizing queries and leveraging caching strategies, you can significantly reduce latency and improve application performance.

export const groupAdminQuery = gql`
  query User(
    $username: String!
    $category: ID
    $onlyVerified: Boolean
  ) {
    # Assumes the schema exposes a user(username:) field on Query.
    user(username: $username) {
      _id
      firstName
      lastName
      subscriptions(
        onlyVerified: $onlyVerified
        category: $category
      ) {
        _id
        title
        status
        progress
      }
    }
  }
`;

Query Params

Here, subscriptions is a nested field that we expect to receive when fetching the user.

To request the user with a specific username, we pass the required parameters as variables in the second argument of the useQuery hook.

const {
  data,
  loading,
  error,
} = useQuery(groupAdminQuery, {
  variables: {
    username: username,
    category: categoryId,
    onlyVerified: true,
  },
});

The username is the only required parameter here, and it selects the user to fetch along with their subscriptions. In this case, we also restrict the list of subscriptions to those that are verified and belong to a certain category.

Using Fragments

Fragments in GraphQL are reusable units of logic that can be shared across multiple queries and mutations. They offer several benefits:

Reusability: Define common fields in a fragment and use them across different queries or mutations.
Maintainability: Update the fragment in one place, and all queries/mutations using it will reflect the change.
Optimized Queries: Fragments help in reducing query complexity and ensure consistent data fetching patterns.

import { gql } from '@apollo/client';

const ITEM_FRAGMENT = gql`
  fragment ItemFragment on Item {
    id
    name
  }
`;

const GET_ITEMS = gql`
  query GetItems {
    items {
      ...ItemFragment
    }
  }
  ${ITEM_FRAGMENT}
`;

const ADD_ITEM = gql`
  mutation AddItem($name: String!) {
    addItem(name: $name) {
      ...ItemFragment
    }
  }
  ${ITEM_FRAGMENT}
`;

What is the role of fragments in GraphQL?

In GraphQL, fragments are a set of fields that can be reused across multiple queries and mutations. They are particularly beneficial when colocated with components to define the component's data requirements. This allows for a more modular and maintainable codebase.

Why are fragments considered a best practice?

Reusability: Fragments allow developers to reuse common sets of fields across different queries and mutations, reducing redundancy and potential errors.
Consistency: By using fragments, any changes to the fields within a fragment automatically propagate to all operations that use it, ensuring consistency across the application.
Colocation with Components: Fragments can be colocated with the components that use them, making it clear which data a component depends on and ensuring that components request only the data they need.
Maintainability: Fragments make it easier to manage and update the GraphQL schema as the application evolves, as changes need to be made in only one place.

Overall, fragments enhance the efficiency and scalability of GraphQL applications by promoting code reuse and reducing the likelihood of errors.

Example of Cache Update:

As we noticed in the query params section, nested fields need to hit the database several times, so naturally the performance would suffer if we overuse them.

So a workaround is to first only load the data required for the page needed.

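/**
 * Retrieves the user's identity information, subscriptions,
 * and notifications in a single request.
 */
// The three fragment definitions are assumed to be declared elsewhere
// and interpolated into this document, as in the fragments example.
export const meQuery = gql`
  query MeQuery {
    me {
      _id
      isAdmin
      identityInfo {
        ...IdentityInfoFragment
      }
      subscriptions {
        ...SubscriptionsFragment
      }
      notifications {
        ...NotificationsFragment
      }
    }
  }
`;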

Let’s say that on the welcome page we never see the subscriptions, or we want a separate loading state for the UI component that shows them, so that the user can interact with the rest of the page in the meantime. In that case it makes sense not to fetch them up front, since doing so only lengthens the loading time for data that is more essential to user interaction.

/**
 * retrieves the user's identity information.
 */
export const meQuery = gql`
  query MeQuery {
    me {
      _id
      isAdmin
      identityInfo {
        username
        firstName
        lastName
        birthDate
        gender
        email
        picture
      }
    }
  }
`;

/**
 * Retrieves the user's subscriptions.
 */
export const userSubscriptions = gql`
  query userSubscriptions {
    mySubscriptions: me {
      _id
      subscriptions {
        _id
        status
        title
      }
    }
  }
`;

One way to run userSubscriptions only after meQuery has finished is to call the two queries separately and set the skip option of the second useQuery to the loading flag of the first. While the first fetch is still loading, skip is true and the second query waits; the two results are then merged into the same object.

PS: Notice that in the example above, the alias mySubscriptions is placed in front of the me field. The alias only renames the key in the response; the underlying field is still me, so the client can tell that we are fetching data that belongs to the same object.

const {
  data,
  loading,
  error,
} = useQuery(meQuery);

const {
  data: subscriptionsData,
  loading: subscriptionsLoading,
  error: subscriptionsError,
} = useQuery(userSubscriptions, {
  skip: loading,
});

Below, when defining the InMemoryCache for ApolloClient, we specify which fields we want to merge with the already cached data when fetching.

const cache = new InMemoryCache({
  typePolicies: {
    Query: {
      fields: {
        // object example
        me: {
          merge(existing, incoming, { mergeObjects }) {
            return mergeObjects(existing, incoming);
          },
        },
        // array example
        tasks: {
          merge(existing = [], incoming: any[]) {
            return [...existing, ...incoming];
          },
        },
      },
    },
  },
});

The me example is for loading extra fields into an already fetched object.

The tasks example is for merging into an array, which is useful for pagination or loading on scroll, for example.
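To see what these two merge strategies do to the cached data, here is a plain-data approximation. It is not Apollo's implementation: `mergeObject` and `mergeList` are hypothetical stand-ins, and the shallow spread only roughly mirrors the behaviour of `mergeObjects`.

```typescript
// Plain-data approximation of the two merge functions; not Apollo's implementation.
function mergeObject<T extends object>(
  existing: Partial<T> | undefined,
  incoming: Partial<T>,
): Partial<T> {
  // Fields from the newer response are layered over what is already cached.
  return { ...existing, ...incoming };
}

function mergeList<T>(existing: T[] = [], incoming: T[]): T[] {
  // New items are appended after the items that are already cached.
  return [...existing, ...incoming];
}

interface Me {
  _id: string;
  isAdmin: boolean;
  subscriptions: { _id: string; title: string }[];
}

// First response (identity only), then the lazily loaded subscriptions.
const afterMeQuery = mergeObject<Me>(undefined, { _id: "u1", isAdmin: false });
const afterSubscriptions = mergeObject<Me>(afterMeQuery, {
  _id: "u1",
  subscriptions: [{ _id: "s1", title: "News" }],
});

// Two pages of tasks arriving one after the other.
const tasks = mergeList(mergeList(undefined, ["t1", "t2"]), ["t3"]);
```

Note that simple appending duplicates items if the same page is fetched twice, so paginated lists often need de-duplication by id as well.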

Conclusion

While GraphQL offers numerous advantages, there are pitfalls to watch out for:

Over-fetching with Nested Queries: It’s easy to create deeply nested queries that fetch large amounts of data. This can lead to performance issues. Always be mindful of the depth and breadth of your queries.
Handling Errors: GraphQL does not rely on HTTP status codes the way REST does; a response can arrive with a 200 status and still contain an errors array. Ensure you implement robust error handling in your client and server to manage GraphQL-specific errors properly.
Rate Limiting: Without proper rate limiting, GraphQL servers can be susceptible to Denial of Service (DoS) attacks. Implement rate limiting on your server to prevent abuse.
Schema Bloat: As your GraphQL API grows, it can become bloated with many types and fields. Regularly review and refactor your schema to keep it maintainable (consistent naming conventions can be very helpful here).
N+1 Query Problem: This occurs when querying for nested resources, leading to multiple database requests. Use tools like DataLoader to batch and cache database requests efficiently.

Despite these challenges, GraphQL remains a powerful tool for API development when implemented correctly. By following best practices and staying aware of potential pitfalls, developers can leverage GraphQL to create efficient, flexible, and performant applications that meet the evolving needs of modern software development.
