Psycopg2 does not auto-generate the id when using copy_from to load a csv file into a Postgres db

2023-08-29 Python development questions

Problem description

I have a csv file with several columns:

upc  date  quantity  customer

In my physical table, each row has an auto-generated id column:

id  upc  date  quantity  customer

When I run my Python script to copy the file into the db, the db seems to interpret the upc as the id. I get this error message:

Error: value "1111111" is out of range for type integer
CONTEXT:  COPY physical, line 1, column id: "1111111"

I've never attempted this before, but I believe this is correct:

def insert_csv(f, table):
    connection = get_postgres_connection()
    cursor = connection.cursor()
    try:
        cursor.copy_from(f, table, sep=',')
        connection.commit()
        return True
    except (psycopg2.Error) as e:
        print(e)
        return False
    finally:
        cursor.close()
        connection.close()

Am I doing something wrong here, or do I have to write another script to get the last id from the table?

Updated working code:

def insert_csv(f, table, columns):
    connection = get_postgres_connection()
    cursor = connection.cursor()
    try:
        column_names = ','.join(columns)
        query = f'''
            COPY {table}({column_names})
            FROM STDIN (FORMAT CSV)
        '''
        cursor.copy_expert(query, f)
        connection.commit()
        return True
    except (psycopg2.Error) as e:
        print(e)
        return False
    finally:
        cursor.close()
        connection.close()

columns = (
        "upc",
        "date_thru",
        "transaction_type",
        "transaction_type_subtype",
        "country_code",
        "customer",
        "quantity",
        "income_gross",
        "fm_serial",
        "date_usage"
    )

with open(dump_file, 'r', newline='', encoding="ISO-8859-1") as f:
        inserted = insert_csv(f, 'physical', columns)

Recommended answer

You need to specify the columns to import. From the documentation:

columns – iterable with name of the columns to import. The length and types should match the content of the file to read. If not specified, it is assumed that the entire table matches the file structure.
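
To see why the upc ends up in the id column, it may help to mimic the positional mapping in plain Python. This is only an illustration of the idea, not what PostgreSQL runs internally, and the sample row values are hypothetical:

```python
# Illustration only: COPY without a column list pairs file fields with
# table columns by position. The sample row below is hypothetical.
table_columns = ("id", "upc", "date", "quantity", "customer")
csv_row = ["1111111", "2023-01-31", "5", "ACME"]


def map_positionally(columns, row):
    """Pair each field with the column at the same position.

    zip() silently stops at the shorter input; a real COPY would
    report an error instead of dropping a column.
    """
    return dict(zip(columns, row))


# No column list: the first field (the upc) is assigned to id.
without_list = map_positionally(table_columns, csv_row)

# Explicit column list: id is left out, so the database fills it in.
with_list = map_positionally(("upc", "date", "quantity", "customer"), csv_row)

print(without_list["id"])  # 1111111
print("id" in with_list)   # False
```

With the explicit column list, id is omitted from the mapping, so the table's default for that column applies.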

Your code may look like this:

import psycopg2

def insert_csv(f, table, columns):
    # connect() stands for your own helper that returns a psycopg2 connection
    connection = connect()
    cursor = connection.cursor()
    try:
        cursor.copy_from(f, table, sep=',', columns=columns)
        connection.commit()
        return True
    except (psycopg2.Error) as e:
        print(e)
        return False
    finally:
        cursor.close()
        connection.close()
        
with open("path_to_my_csv") as file:
    insert_csv(file, "my_table", ("upc", "date", "quantity", "customer"))
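
One caveat, as far as I know: copy_from() sends the data in PostgreSQL's text format with sep as the delimiter, so it does not interpret CSV quoting. A quoted field that contains a comma would be split in two. The sketch below shows the difference using only the standard library; the sample line is hypothetical:

```python
import csv
import io

# Hypothetical line with a quoted field that contains the separator.
line = '1111111,2023-01-31,5,"Smith, John"\n'

# Splitting on the separator alone ignores the quotes.
naive_fields = line.rstrip("\n").split(",")

# A CSV-aware parser keeps the quoted field together.
csv_fields = next(csv.reader(io.StringIO(line)))

print(len(naive_fields))  # 5
print(len(csv_fields))    # 4
```

If the file may contain quoted fields, the CSV format of COPY is the safer choice.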

If you have to use copy_expert(), modify your function as follows:

def insert_csv(f, table, columns):
    connection = connect()
    cursor = connection.cursor()
    try:
        column_names = ','.join(columns)
        copy_cmd = f"copy {table}({column_names}) from stdin (format csv)"
        cursor.copy_expert(copy_cmd, f)
        connection.commit()
        return True
    except (psycopg2.Error) as e:
        print(e)
        return False
    finally:
        cursor.close()
        connection.close()
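
Because the table and column names are interpolated into the statement with an f-string, names that come from untrusted input could alter the SQL. Below is a minimal sketch of one way to guard against that. quote_ident and build_copy_sql are hypothetical helpers written for this example, not part of psycopg2. psycopg2 also ships the psycopg2.sql module (sql.SQL, sql.Identifier) for composing statements, which is probably the better choice in real code.

```python
def quote_ident(name):
    """Quote a SQL identifier by doubling embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_copy_sql(table, columns):
    """Build a COPY ... FROM STDIN statement with quoted identifiers."""
    column_list = ", ".join(quote_ident(c) for c in columns)
    return f"COPY {quote_ident(table)} ({column_list}) FROM STDIN (FORMAT CSV)"


copy_cmd = build_copy_sql("physical", ("upc", "customer"))
print(copy_cmd)
# COPY "physical" ("upc", "customer") FROM STDIN (FORMAT CSV)
```

The resulting string can be passed to cursor.copy_expert(copy_cmd, f). Note that quoted identifiers are case-sensitive, and a schema-qualified name would need each part quoted separately.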
