D7net
Home
Console
Upload
information
Create File
Create Folder
About
Tools
:
/
proc
/
self
/
root
/
opt
/
alt
/
postgresql11
/
usr
/
share
/
doc
/
alt-postgresql11-9.2.24
/
html
/
Filename :
gin-tips.html
back
Copy
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"> <HTML ><HEAD ><TITLE >GIN Tips and Tricks</TITLE ><META NAME="GENERATOR" CONTENT="Modular DocBook HTML Stylesheet Version 1.79"><LINK REV="MADE" HREF="mailto:pgsql-docs@postgresql.org"><LINK REL="HOME" TITLE="PostgreSQL 9.2.24 Documentation" HREF="index.html"><LINK REL="UP" TITLE="GIN Indexes" HREF="gin.html"><LINK REL="PREVIOUS" TITLE="Implementation" HREF="gin-implementation.html"><LINK REL="NEXT" TITLE="Limitations" HREF="gin-limit.html"><LINK REL="STYLESHEET" TYPE="text/css" HREF="stylesheet.css"><META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1"><META NAME="creation" CONTENT="2017-11-06T22:43:11"></HEAD ><BODY CLASS="SECT1" ><DIV CLASS="NAVHEADER" ><TABLE SUMMARY="Header navigation table" WIDTH="100%" BORDER="0" CELLPADDING="0" CELLSPACING="0" ><TR ><TH COLSPAN="5" ALIGN="center" VALIGN="bottom" ><A HREF="index.html" >PostgreSQL 9.2.24 Documentation</A ></TH ></TR ><TR ><TD WIDTH="10%" ALIGN="left" VALIGN="top" ><A TITLE="Implementation" HREF="gin-implementation.html" ACCESSKEY="P" >Prev</A ></TD ><TD WIDTH="10%" ALIGN="left" VALIGN="top" ><A HREF="gin.html" ACCESSKEY="U" >Up</A ></TD ><TD WIDTH="60%" ALIGN="center" VALIGN="bottom" >Chapter 55. GIN Indexes</TD ><TD WIDTH="20%" ALIGN="right" VALIGN="top" ><A TITLE="Limitations" HREF="gin-limit.html" ACCESSKEY="N" >Next</A ></TD ></TR ></TABLE ><HR ALIGN="LEFT" WIDTH="100%"></DIV ><DIV CLASS="SECT1" ><H1 CLASS="SECT1" ><A NAME="GIN-TIPS" >55.4. GIN Tips and Tricks</A ></H1 ><P ></P ><DIV CLASS="VARIABLELIST" ><DL ><DT >Create vs. insert</DT ><DD ><P > Insertion into a <ACRONYM CLASS="ACRONYM" >GIN</ACRONYM > index can be slow due to the likelihood of many keys being inserted for each item. So, for bulk insertions into a table it is advisable to drop the GIN index and recreate it after finishing bulk insertion. </P ><P > As of <SPAN CLASS="PRODUCTNAME" >PostgreSQL</SPAN > 8.4, this advice is less necessary since delayed indexing is used (see <A HREF="gin-implementation.html#GIN-FAST-UPDATE" >Section 55.3.1</A > for details). But for very large updates it may still be best to drop and recreate the index. </P ></DD ><DT ><A HREF="runtime-config-resource.html#GUC-MAINTENANCE-WORK-MEM" >maintenance_work_mem</A ></DT ><DD ><P > Build time for a <ACRONYM CLASS="ACRONYM" >GIN</ACRONYM > index is very sensitive to the <TT CLASS="VARNAME" >maintenance_work_mem</TT > setting; it doesn't pay to skimp on work memory during index creation. </P ></DD ><DT ><A HREF="runtime-config-resource.html#GUC-WORK-MEM" >work_mem</A ></DT ><DD ><P > During a series of insertions into an existing <ACRONYM CLASS="ACRONYM" >GIN</ACRONYM > index that has <TT CLASS="LITERAL" >FASTUPDATE</TT > enabled, the system will clean up the pending-entry list whenever the list grows larger than <TT CLASS="VARNAME" >work_mem</TT >. To avoid fluctuations in observed response time, it's desirable to have pending-list cleanup occur in the background (i.e., via autovacuum). Foreground cleanup operations can be avoided by increasing <TT CLASS="VARNAME" >work_mem</TT > or making autovacuum more aggressive. However, enlarging <TT CLASS="VARNAME" >work_mem</TT > means that if a foreground cleanup does occur, it will take even longer. </P ></DD ><DT ><A HREF="runtime-config-client.html#GUC-GIN-FUZZY-SEARCH-LIMIT" >gin_fuzzy_search_limit</A ></DT ><DD ><P > The primary goal of developing <ACRONYM CLASS="ACRONYM" >GIN</ACRONYM > indexes was to create support for highly scalable full-text search in <SPAN CLASS="PRODUCTNAME" >PostgreSQL</SPAN >, and there are often situations when a full-text search returns a very large set of results. Moreover, this often happens when the query contains very frequent words, so that the large result set is not even useful. Since reading many tuples from the disk and sorting them could take a lot of time, this is unacceptable for production. (Note that the index search itself is very fast.) </P ><P > To facilitate controlled execution of such queries, <ACRONYM CLASS="ACRONYM" >GIN</ACRONYM > has a configurable soft upper limit on the number of rows returned: the <TT CLASS="VARNAME" >gin_fuzzy_search_limit</TT > configuration parameter. It is set to 0 (meaning no limit) by default. If a non-zero limit is set, then the returned set is a subset of the whole result set, chosen at random. </P ><P > <SPAN CLASS="QUOTE" >"Soft"</SPAN > means that the actual number of returned results could differ somewhat from the specified limit, depending on the query and the quality of the system's random number generator. </P ><P > From experience, values in the thousands (e.g., 5000 — 20000) work well. </P ></DD ></DL ></DIV ></DIV ><DIV CLASS="NAVFOOTER" ><HR ALIGN="LEFT" WIDTH="100%"><TABLE SUMMARY="Footer navigation table" WIDTH="100%" BORDER="0" CELLPADDING="0" CELLSPACING="0" ><TR ><TD WIDTH="33%" ALIGN="left" VALIGN="top" ><A HREF="gin-implementation.html" ACCESSKEY="P" >Prev</A ></TD ><TD WIDTH="34%" ALIGN="center" VALIGN="top" ><A HREF="index.html" ACCESSKEY="H" >Home</A ></TD ><TD WIDTH="33%" ALIGN="right" VALIGN="top" ><A HREF="gin-limit.html" ACCESSKEY="N" >Next</A ></TD ></TR ><TR ><TD WIDTH="33%" ALIGN="left" VALIGN="top" >Implementation</TD ><TD WIDTH="34%" ALIGN="center" VALIGN="top" ><A HREF="gin.html" ACCESSKEY="U" >Up</A ></TD ><TD WIDTH="33%" ALIGN="right" VALIGN="top" >Limitations</TD ></TR ></TABLE ></DIV ></BODY ></HTML >