问题描述
Goal - To read csv file uploaded on google cloud storage bucket.
Environment - Run Jupyter notebook using SSH instance on Master node. Using python on Jupyter notebook trying to access a simple csv file uploaded onto google cloud storage bucket.
Approaches -
1st approach - Write a simple python program
Wrote following program
import csv
f = open('gs://python_test_hm/train.csv' , 'rb' )
csv_f = csv.reader(f)
for row in csv_f
print row
Results - Error message "No such file or directory"
2nd Approach - Using gcloud Package tried to access the train.csv file. The sample code is shown below. Below code is not the actual code. The file on google Cloud storage in my version of code was referred to "gs:///Filename.csv" Results - Error message "No such file or directory"
Load data from CSV
import csv
from gcloud import bigquery
from gcloud.bigquery import SchemaField
client = bigquery.Client()
dataset = client.dataset('dataset_name')
dataset.create() # API request
SCHEMA = [
SchemaField('full_name', 'STRING', mode='required'),
SchemaField('age', 'INTEGER', mode='required'),
]
table = dataset.table('table_name', SCHEMA)
table.create()
with open('csv_file', 'rb') as readable:
table.upload_from_file(
readable, source_format='CSV', skip_leading_rows=1)
3rd Approach -
import csv
import urllib
url = 'https://storage.cloud.google.com/<bucket>/train.csv'
response = urllib.urlopen(url)
cr = csv.reader(response)
print cr
for row in cr:
print row
Results - Above code doesn't result in any error but it displays the XML content of the google page as shown below. I am interested in viewing the data of the train csv file.
['<!DOCTYPE html>']
['<html lang="en">']
[' <head>']
[' <meta charset="utf-8">']
[' <meta content="width=300', ' initial-scale=1" name="viewport">']
[' <meta name="google-site-verification" content="LrdTUW9psUAMbh4Ia074- BPEVmcpBxF6Gwf0MSgQXZs">']
[' <title>Sign in - Google Accounts</title>']
Can someone throw some light on what could be possibly wrong here and how do I achieve my goal? Your help is highly appreciated.
Thanks so much for your help!
I assume you are using Jupyter notebook running on a machine in Google Cloud Platform (GCP)? If that's the case, you will already have the Google Cloud SDK running on that machine (by default).
With this setup you have 2 easy options to work with Google Cloud Storage (GCS):
Use the gcloud/gsutil commands in Jupyter
Writing to GCS:
gsutil cp train.csv gs://python_test_hm/train.csvReading from GCS:
gsutil cp gs://python_test_hm/train.csv train.csvUse google-cloud python library
Writing to GCS:
from google.cloud import storage
client = storage.Client()
bucket = client.get_bucket('python_test_hm')
blob = bucket.blob('train.csv')
blob.upload_from_string('this is test content!')
Reading from GCS:
from google.cloud import storage
client = storage.Client()
bucket = client.get_bucket('python_test_hm')
blob = storage.Blob('train.csv', bucket)
content = blob.download_as_string()
这篇关于无法读取上传到谷歌云存储桶的 csv 文件的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持跟版网!


大气响应式网络建站服务公司织梦模板
高端大气html5设计公司网站源码
织梦dede网页模板下载素材销售下载站平台(带会员中心带筛选)
财税代理公司注册代理记账网站织梦模板(带手机端)
成人高考自考在职研究生教育机构网站源码(带手机端)
高端HTML5响应式企业集团通用类网站织梦模板(自适应手机端)