0%

PostgreSQL批量操作数据库

批量操作可以减少数据库与应用程序的交互次数,提高数据处理的吞吐量。本文将通过示例介绍如何批量插入、更新和删除数据。

批量插入

使用 INSERT INTO … SELECT 的方法。
1
2
3
4
5
6
7
postgres=# INSERT INTO tbl1 (id, info ,crt_time) SELECT GENERATE_SERIES(1,10000),'test',NOW();    
INSERT 0 10000
postgres=# SELECT COUNT(*) FROM tbl1;
count
-------
10001
(1 row)
使用 VALUES(),(),…(); 的方法
1
2
postgres=# INSERT INTO tbl1 (id,info,crt_time) VALUES (1,'test',NOW()), (2,'test2',NOW()), (3,'test3',NOW());    
INSERT 0 3
使用 BEGIN; …多条INSERT…; END; 的方法。严格来说,这不属于批量,但可以减少事务提交时的同步等待,同样可以提升性能。
1
2
3
4
5
6
7
8
9
10
postgres=# BEGIN;    
BEGIN
postgres=# INSERT INTO tbl1 (id,info,crt_time) VALUES (1,'test',NOW());
INSERT 0 1
postgres=# INSERT INTO tbl1 (id,info,crt_time) VALUES (2,'test2',NOW());
INSERT 0 1
postgres=# INSERT INTO tbl1 (id,info,crt_time) VALUES (3,'test3',NOW());
INSERT 0 1
postgres=# END;
COMMIT
使用COPY协议。COPY协议与INSERT协议不一样,更加精简,插入效率高。
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
test03=# \d test  
Table "public.test"
Column | Type | Modifiers
----------+-----------------------------+-----------
id | integer | not null
info | text |
crt_time | timestamp without time zone |
Indexes:
"test_pkey" PRIMARY KEY, btree (id)

test03=# COPY test FROM stdin;
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself.
>> 8 'test' '2017-01-01'
>> 9 'test9' '2017-02-02'
>> \.
COPY 2

批量更新数据

1
2
3
4
5
6
7
8
9
10
11
12
test03=# UPDATE test SET info=tmp.info from (VALUES (1,'new1'),(2,'new2'),(6,'new6')) AS tmp (id,info) WHERE test.id=tmp.id;  
UPDATE 3
test03=# SELECT * FROM test;
id | info | crt_time
----+--------------+----------------------------
3 | hello | 2017-04-24 15:31:49.14291
4 | digoal0123 | 2017-04-24 15:42:50.912887
5 | hello digoal | 2017-04-24 15:57:29.622045
1 | new1 | 2017-04-24 15:58:55.610072
2 | new2 | 2017-04-24 15:28:20.37392
6 | new6 | 2017-04-24 15:59:12.265915
(6 rows)

批量删除数据

1
2
3
4
5
6
7
8
test03=# DELETE FROM test USING (VALUES (3),(4),(5)) AS tmp(id) WHERE test.id=tmp.id;  
DELETE 3
test03=# SELECT * FROM test;
id | info | crt_time
----+---------+----------------------------
1 | new1 | 2017-04-24 15:58:55.610072
2 | new2 | 2017-04-24 15:28:20.37392
6 | new6 | 2017-04-24 15:59:12.265915

如果要清除全表,建议您使用TRUNCATE。

1
2
3
4
5
6
7
8
test03=# SET lock_timeout = '1s';
SET
test03=# TRUNCATE test;
TRUNCATE TABLE
test03=# SELECT * FROM test;
id | info | crt_time
----+------+----------
(0 rows)

批量更新的Mapper

两表不在同一个库的时候,先查询List后更新

1
2
3
4
5
6
7
8
@Update("<script>"+
"update " + TABLE_NAME + " set cust_id = tmp.cust_id from (values" +
"<foreach collection = 'borgtPols' item = 'item' index = 'index' separator = ','>"+
"( #{item.polId}, #{item.insuCode}, #{item.custId} )" +
"</foreach>" +
") as tmp (pol_id, insu_code, cust_id) where "+ TABLE_NAME +".insu_code = tmp.insu_code and "+ TABLE_NAME +".pol_id = tmp.pol_id"+
"</script>")
void updateAllCustId(@Param("borgtPols") List<CoinsBorgtPol> coinsBorgtPol) throws DataAccessException;

直接连表更新:

1
2
3
4
5
6
@Update("update coins_borgt_pol set cust_id = tmp.cust_no from( select pol.pol_id, pol.insu_code, crm.cust_no 
from coins_borgt_pol as pol
left join coins_crm_tb_bs_cust_base as crm on pol.cust_name = crm.cust_nm
where pol.cust_id = '' and crm.cust_no != '' and crm.clr_date >= #{startDay} and
pol.clr_date >= #{startDay} and pol.clr_date <= #{endDay} ) as tmp (pol_id,insu_code,cust_no) where coins_borgt_pol.pol_id = tmp.pol_id and coins_borgt_pol.insu_code")
void updateAllCustId(String startDay, String endDay) throws DataAccessException;