PostgreSQL批量操作数据库

批量操作可以减少数据库与应用程序的交互次数，提高数据处理的吞吐量。本文将通过示例介绍如何批量插入、更新和删除数据。

批量插入

使用 INSERT INTO … SELECT 的方法。

postgres=# INSERT INTO tbl1 (id, info ,crt_time) SELECT GENERATE_SERIES(1,10000),'test',NOW();    
INSERT 0 10000    
postgres=# SELECT COUNT(*) FROM tbl1;    
count     
-------    
10001    
(1 row)

使用 VALUES(),(),…(); 的方法

1 2	postgres=# INSERT INTO tbl1 (id,info,crt_time) VALUES (1,'test',NOW()), (2,'test2',NOW()), (3,'test3',NOW()); INSERT 0 3

使用 BEGIN; …多条INSERT…; END; 的方法。严格来说，这不属于批量，但可以减少事务提交时的同步等待，同样可以提升性能。

postgres=# BEGIN;    
BEGIN    
postgres=# INSERT INTO tbl1 (id,info,crt_time) VALUES (1,'test',NOW());    
INSERT 0 1    
postgres=# INSERT INTO tbl1 (id,info,crt_time) VALUES (2,'test2',NOW());    
INSERT 0 1    
postgres=# INSERT INTO tbl1 (id,info,crt_time) VALUES (3,'test3',NOW());    
INSERT 0 1    
postgres=# END;    
COMMIT

使用COPY协议。COPY协议与INSERT协议不一样，更加精简，插入效率高。

test03=# \d test  
              Table "public.test"  
Column  |            Type             | Modifiers   
----------+-----------------------------+-----------  
id       | integer                     | not null  
info     | text                        |   
crt_time | timestamp without time zone |   
Indexes:  
  "test_pkey" PRIMARY KEY, btree (id)  

test03=# COPY test FROM stdin;  
Enter data to be copied followed by a newline.  
End with a backslash and a period on a line by itself.  
>> 8    'test'  '2017-01-01'  
>> 9    'test9' '2017-02-02'  
>> \.  
COPY 2

批量更新数据

test03=# UPDATE test SET info=tmp.info from (VALUES (1,'new1'),(2,'new2'),(6,'new6')) AS tmp (id,info) WHERE test.id=tmp.id;  
UPDATE 3  
test03=# SELECT * FROM test;  
 id |     info     |          crt_time            
----+--------------+----------------------------  
  3 | hello        | 2017-04-24 15:31:49.14291  
  4 | digoal0123   | 2017-04-24 15:42:50.912887  
  5 | hello digoal | 2017-04-24 15:57:29.622045  
  1 | new1         | 2017-04-24 15:58:55.610072  
  2 | new2         | 2017-04-24 15:28:20.37392  
  6 | new6         | 2017-04-24 15:59:12.265915  
(6 rows)

批量删除数据

test03=# DELETE FROM test USING (VALUES (3),(4),(5)) AS tmp(id) WHERE test.id=tmp.id;  
DELETE 3  
test03=# SELECT * FROM test;  
 id |  info   |          crt_time            
----+---------+----------------------------  
  1 | new1    | 2017-04-24 15:58:55.610072  
  2 | new2    | 2017-04-24 15:28:20.37392  
  6 | new6    | 2017-04-24 15:59:12.265915

如果要清除全表，建议您使用TRUNCATE。

test03=# SET lock_timeout = '1s';
SET
test03=# TRUNCATE test;  
TRUNCATE TABLE  
test03=# SELECT * FROM test;  
 id | info | crt_time   
----+------+----------  
(0 rows)

批量更新的Mapper

两表不在同一个库的时候，先查询List后更新

@Update("<script>"+
            "update " + TABLE_NAME + " set cust_id = tmp.cust_id from (values" +
            "<foreach collection = 'borgtPols' item = 'item' index = 'index' separator = ','>"+
            "( #{item.polId}, #{item.insuCode}, #{item.custId} )" +
            "</foreach>" +
            ") as tmp (pol_id, insu_code, cust_id) where "+ TABLE_NAME +".insu_code = tmp.insu_code and "+ TABLE_NAME +".pol_id = tmp.pol_id"+
            "</script>")
void updateAllCustId(@Param("borgtPols") List<CoinsBorgtPol> coinsBorgtPol) throws DataAccessException;

直接连表更新：

@Update("update coins_borgt_pol set cust_id = tmp.cust_no from( select pol.pol_id, pol.insu_code, crm.cust_no 
		from coins_borgt_pol as pol 
		left join coins_crm_tb_bs_cust_base as crm on pol.cust_name = crm.cust_nm
		where pol.cust_id = '' and crm.cust_no != '' and crm.clr_date >= #{startDay} and 
		pol.clr_date >= #{startDay} and pol.clr_date <= #{endDay} ) as tmp (pol_id,insu_code,cust_no) where coins_borgt_pol.pol_id  = tmp.pol_id and coins_borgt_pol.insu_code")
void updateAllCustId(String startDay, String endDay) throws DataAccessException;