GraphQL, like any technology, has its problems. Some of them result directly from its architecture, and some are identical to what we see in any other application. The solutions, however, are completely different.
To present the problem, let’s assume the following application architecture:
And here is the corresponding GraphQL query to fetch the data. We fetch all links, along with each link's poster and the links that poster has added to the system:
{
  allLinks {
    id
    url
    description
    createdAt
    postedBy {
      id
      name
      links {
        id
      }
    }
  }
}
As the log below shows, this produces the classic N+1 problem with relations.
Link Load (0.4ms) SELECT "links".* FROM "links" ORDER BY created_at DESC
↳ app/controllers/graphql_controller.rb:5:in `execute'
User Load (0.3ms) SELECT "users".* FROM "users" WHERE "users"."id" = ? LIMIT ? [["id", 40], ["LIMIT", 1]]
↳ app/controllers/graphql_controller.rb:5:in `execute'
Link Load (0.3ms) SELECT "links".* FROM "links" WHERE "links"."user_id" = ? [["user_id", 40]]
↳ app/controllers/graphql_controller.rb:5:in `execute'
User Load (0.1ms) SELECT "users".* FROM "users" WHERE "users"."id" = ? LIMIT ? [["id", 38], ["LIMIT", 1]]
↳ app/controllers/graphql_controller.rb:5:in `execute'
Link Load (0.1ms) SELECT "links".* FROM "links" WHERE "links"."user_id" = ? [["user_id", 38]]
↳ app/controllers/graphql_controller.rb:5:in `execute'
User Load (0.2ms) SELECT "users".* FROM "users" WHERE "users"."id" = ? LIMIT ? [["id", 36], ["LIMIT", 1]]
↳ app/controllers/graphql_controller.rb:5:in `execute'
Link Load (0.1ms) SELECT "links".* FROM "links" WHERE "links"."user_id" = ? [["user_id", 36]]
↳ app/controllers/graphql_controller.rb:5:in `execute'
User Load (0.1ms) SELECT "users".* FROM "users" WHERE "users"."id" = ? LIMIT ? [["id", 34], ["LIMIT", 1]]
↳ app/controllers/graphql_controller.rb:5:in `execute'
Link Load (0.2ms) SELECT "links".* FROM "links" WHERE "links"."user_id" = ? [["user_id", 34]]
↳ app/controllers/graphql_controller.rb:5:in `execute'
User Load (0.1ms) SELECT "users".* FROM "users" WHERE "users"."id" = ? LIMIT ? [["id", 32], ["LIMIT", 1]]
In this case, it works exactly like this piece of code: Link.all.map(&:user).map(&:links).
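To make the query count tangible, here is a minimal plain-Ruby sketch of the same access pattern. FakeDb, its rows, and resolve_naively are hypothetical stand-ins invented for illustration (no ActiveRecord involved); every call into FakeDb counts as one SQL query.

```ruby
# In-memory stand-in for the database (hypothetical data).
# Every public call counts as one SQL query.
module FakeDb
  USERS = { 1 => "Ann", 2 => "Bob" }.freeze
  # link id => user_id of the poster
  LINKS = { 10 => 1, 11 => 2, 12 => 1 }.freeze

  @queries = 0

  class << self
    attr_reader :queries

    def reset!
      @queries = 0
    end

    # SELECT * FROM links
    def all_links
      @queries += 1
      LINKS.map { |id, user_id| { id: id, user_id: user_id } }
    end

    # SELECT * FROM users WHERE id = ? LIMIT 1
    def user(id)
      @queries += 1
      { id: id, name: USERS.fetch(id) }
    end

    # SELECT * FROM links WHERE user_id = ?
    def links_for(user_id)
      @queries += 1
      LINKS.select { |_, uid| uid == user_id }.keys
    end
  end
end

# Naive resolution: one query for the list, then two more per link.
def resolve_naively
  FakeDb.all_links.map do |link|
    user = FakeDb.user(link[:user_id])
    { link: link[:id], posted_by: user[:name], links: FakeDb.links_for(user[:id]) }
  end
end

FakeDb.reset!
result = resolve_naively
puts "links: #{result.size}, queries: #{FakeDb.queries}"
```

With three links this prints `links: 3, queries: 7`: one query for the list plus two for every link, which is the same shape as the log above.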
We seem to know the solution to the problem: Link.includes(user: :links).map(&:user).map(&:links), but will it really work? Let’s check it out!
To verify the fix, I changed the GraphQL query to only use a few fields and no relation.
{
  allLinks {
    id
    url
    description
    createdAt
  }
}
Unfortunately, the result shows that even though the query no longer asks for the user or the user's links, we still load that data from the database. It is redundant, and with a more complicated structure this approach simply becomes inefficient.
In GraphQL, such problems are solved differently: by loading data in batches, and only when the query actually asks for it. It is a form of lazy loading. One of the most popular libraries for this is https://github.com/Shopify/graphql-batch/.
Unfortunately, its installation is not as hassle-free as it may seem. The data loaders are available here: https://github.com/Shopify/graphql-batch/tree/master/examples; I mean the RecordLoader and AssociationLoader classes. Let's install the library the usual way, with gem 'graphql-batch', and then add it to our schema, along with the loaders:
# graphql-ruby/app/graphql/graphql_tutorial_schema.rb
class GraphqlTutorialSchema < GraphQL::Schema
  query Types::QueryType
  mutation Types::MutationType
  use GraphQL::Batch
  ...
end
And our types:
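# graphql-ruby/app/graphql/types/link_type.rb
module Types
  class LinkType < BaseNode
    field :created_at, DateTimeType, null: false
    field :url, String, null: false
    field :description, String, null: false
    # resolver_method: points the field at #user defined below on this type;
    # method: :user would call object.user directly and bypass the loader.
    field :posted_by, UserType, null: false, resolver_method: :user
    field :votes, [Types::VoteType], null: false

    # Batch-load the poster by id instead of querying once per link.
    def user
      Loaders::RecordLoader.for(User).load(object.user_id)
    end
  end
end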
# graphql-ruby/app/graphql/types/user_type.rb
module Types
  class UserType < BaseNode
    field :created_at, DateTimeType, null: false
    field :name, String, null: false
    field :email, String, null: false
    field :votes, [VoteType], null: false
    field :links, [LinkType], null: false

    # Batch-load the links association for all users resolved in this query.
    def links
      Loaders::AssociationLoader.for(User, :links).load(object)
    end
  end
end
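To show what "loading in batches" means in practice, here is a plain-Ruby sketch of the idea behind a record loader. It is not how graphql-batch is implemented (the gem builds on promises and on the executor's lazy resolution); ToyRecordLoader, USER_ROWS and QUERY_LOG are hypothetical names made up for this illustration.

```ruby
# Hypothetical in-memory "users" table; QUERY_LOG records each simulated SQL query.
USER_ROWS = { 40 => "Ann", 38 => "Bob", 36 => "Cid" }.freeze
QUERY_LOG = []

# A toy loader: #load only remembers the key and returns a lazy value.
# The single batched lookup happens the first time any value is read.
class ToyRecordLoader
  def initialize(&fetch_many)
    @fetch_many = fetch_many
    @pending = []
    @cache = {}
  end

  # Queue the key (once) and return a lambda that resolves it later.
  def load(key)
    @pending << key unless @cache.key?(key) || @pending.include?(key)
    -> { sync; @cache[key] }
  end

  private

  # Resolve every queued key with one call to the fetch block.
  def sync
    return if @pending.empty?

    keys = @pending
    @pending = []
    @cache.merge!(@fetch_many.call(keys))
  end
end

loader = ToyRecordLoader.new do |ids|
  QUERY_LOG << "SELECT * FROM users WHERE id IN (#{ids.join(', ')})"
  USER_ROWS.slice(*ids)
end

# Four links, two of them posted by the same user.
lazy_users = [40, 38, 40, 36].map { |user_id| loader.load(user_id) }
names = lazy_users.map(&:call)

puts names.inspect
puts QUERY_LOG
```

Nothing is fetched while the lazy values are being collected. Reading the first one triggers a single lookup for the ids 40, 38 and 36, and the duplicate id is served from the cache. If a field is never requested, load is never called and no query is issued at all.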
As a result of using the loaders, we batch the data and fetch it in two simple SQL queries:
There are also other solutions that solve this problem, such as:
Complexity of queries
N+1 queries are not everything. In GraphQL the client can freely request further attributes, which can sometimes be too much for the server, especially when data can be nested freely. How to deal with it? We can limit the complexity of the query, but to do this we also need to specify the cost of the fields. By default, it is set to 1. We set the cost with the complexity: option, for example: field :links, [LinkType], null: false, complexity: 101. For the limit to actually work, you still need to set the maximum complexity in your schema:
class GraphqlTutorialSchema < GraphQL::Schema
  query Types::QueryType
  mutation Types::MutationType
  use GraphQL::Batch
  max_complexity 100
  ...
end
Tracing
GraphQL processes queries differently, and tracing is not that simple compared to what we can do locally. Unfortunately, rack-mini-profiler or a regular SQL log will not tell us everything and will not point out which part of the query is responsible for a given time slice. In the case of GraphQL-Ruby, we can use the commercial solutions listed here: https://graphql-ruby.org/queries/tracing, or try to prepare our own tracing. The snippet below shows what a local tracer can look like.
# lib/my_custom_tracer.rb
class MyCustomTracer < GraphQL::Tracing::PlatformTracing
  # Maps GraphQL execution phases to the keys reported by this tracer.
  self.platform_keys = {
    'lex' => 'graphql.lex',
    'parse' => 'graphql.parse',
    'validate' => 'graphql.validate',
    'analyze_query' => 'graphql.analyze_query',
    'analyze_multiplex' => 'graphql.analyze_multiplex',
    'execute_multiplex' => 'graphql.execute_multiplex',
    'execute_query' => 'graphql.execute_query',
    'execute_query_lazy' => 'graphql.execute_query_lazy'
  }

  # Times the wrapped step with a monotonic clock and reports the duration.
  def platform_trace(platform_key, key, _data, &block)
    start = ::Process.clock_gettime(::Process::CLOCK_MONOTONIC)
    result = block.call
    duration = ::Process.clock_gettime(::Process::CLOCK_MONOTONIC) - start
    observe(platform_key, key, duration)
    result
  end

  def platform_field_key(type, field)
    "graphql.#{type.graphql_name}.#{field.graphql_name}"
  end

  def platform_authorized_key(type)
    "graphql.authorized.#{type.graphql_name}"
  end

  def platform_resolve_type_key(type)
    "graphql.resolve_type.#{type.graphql_name}"
  end

  def observe(platform_key, key, duration)
    return if key == 'authorized'

    # String#yellow is not core Ruby; it is assumed to come from a coloring gem such as colorize.
    puts "platform_key: #{platform_key}, key: #{key}, duration: #{(duration * 1000).round(5)} ms".yellow
  end
end
Installation is also extremely simple: you register the tracer in the schema configuration with tracer(MyCustomTracer.new), as in the example below:
# graphql-ruby/app/graphql/graphql_tutorial_schema.rb
class GraphqlTutorialSchema < GraphQL::Schema
  query Types::QueryType
  mutation Types::MutationType
  use GraphQL::Batch
  tracer(MyCustomTracer.new)
  ...
end
The output from such tracing looks like this:
Started POST "/graphql" for ::1 at 2021-06-17 22:02:44 +0200
(0.1ms) SELECT sqlite_version(*)
Processing by GraphqlController#execute as */*
Parameters: {"query"=>"{\n allLinks {\n id\n url\n description\n createdAt\n postedBy {\n id\n name\n links {\n id\n }\n }\n }\n}", "graphql"=>{"query"=>"{\n allLinks {\n id\n url\n description\n createdAt\n postedBy {\n id\n name\n links {\n id\n }\n }\n }\n}"}}
platform_key: graphql.lex, key: lex, duration: 0.156 ms
platform_key: graphql.parse, key: parse, duration: 0.108 ms
platform_key: graphql.validate, key: validate, duration: 0.537 ms
platform_key: graphql.analyze_query, key: analyze_query, duration: 0.123 ms
platform_key: graphql.analyze_multiplex, key: analyze_multiplex, duration: 0.159 ms
Link Load (0.4ms) SELECT "links".* FROM "links"
↳ app/graphql/graphql_tutorial_schema.rb:21:in `platform_trace'
platform_key: graphql.execute_query, key: execute_query, duration: 15.562 ms
↳ app/graphql/loaders/record_loader.rb:12:in `perform'
↳ app/graphql/loaders/association_loader.rb:46:in `preload_association'
platform_key: graphql.execute_query_lazy, key: execute_query_lazy, duration: 14.12 ms
platform_key: graphql.execute_multiplex, key: execute_multiplex, duration: 31.11 ms
Completed 200 OK in 48ms (Views: 1.2ms | ActiveRecord: 2.0ms | Allocations: 40128)
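The timing pattern from platform_trace can also be tried in isolation, without GraphQL. Below is a minimal sketch, assuming we only want to collect durations per key; TimingRecorder and the keys passed to it are hypothetical and not part of GraphQL-Ruby.

```ruby
# Collects durations per key; a stand-in for the observe step of the tracer above.
class TimingRecorder
  attr_reader :timings

  def initialize
    # key => list of durations in milliseconds
    @timings = Hash.new { |hash, key| hash[key] = [] }
  end

  # Runs the block, records how long it took, and returns the block's result.
  def trace(key)
    start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    result = yield
    duration_ms = (Process.clock_gettime(Process::CLOCK_MONOTONIC) - start) * 1000
    @timings[key] << duration_ms
    result
  end
end

recorder = TimingRecorder.new

# Nested calls mimic a field being resolved inside the execution of a query.
recorder.trace("graphql.execute_query") do
  recorder.trace("graphql.Link.postedBy") { (1..10_000).sum }
end

recorder.timings.each do |key, durations|
  puts "#{key}: #{durations.sum.round(3)} ms over #{durations.size} call(s)"
end
```

The monotonic clock is used because it is not affected by system clock adjustments. The wrapper returns the block's result unchanged, so it can be placed around any step without altering what that step produces.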
Summary
GraphQL is not a new technology anymore, but the solutions to its problems are not fully standardized unless they are part of the library. Implementing this technology in a project opens up a lot of opportunities for working with the frontend, and I personally consider it a step up in quality compared to what a REST API offers.