Yoke is a middleware framework for Vert.x
QUESTION
Regex exclude rows in csv that has forbidding word
Asked 2022-Feb-12 at 08:18
I've been trying to exclude all rows that contain 'shirt' and then, from what is left, keep only the rows that contain 'cotton' (case insensitive).
For example:
"Cotton Shirt for sale" - fail
"Cotton Dress for Sale" - pass
"dress shirt-V-neck-cotton" - fail
"no words relevant" - fail (no cotton in it)
"cotton-url click" - pass
My regex:
pattern = re.compile('(?i)^((?!.*shirt).).*(?=.*cotton.*)')
But for some reason this row still remains in my CSV output:
"Stone Italian Yarn Fringe Yoke Cable Cotton Shirt New Look"
my code:
pattern1 = re.compile("(?i)(.*shirt.*)")
with open("sample.csv", 'r', encoding="utf-8") as bigCSV:
    csv_reader = csv.reader(bigCSV)
    counterWithout = 0
    counterCheck = 0
    headFlag = True
    for row in csv_reader:
        if headFlag:
            header = row
            headFlag = False
        if any(pattern.match(line) for line in row):  # there is a difference in the number of rows here
            if any(pattern1.match(line) for line in row):
                print(row)
                counterCheck += 1
            counterWithout += 1
Please help me fix the regex.
ANSWER
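Answered 2022-Feb-11 at 10:08
You can use .*(cotton)?.*shirt.*|.*shirt.*(cotton)?.*
Because both (cotton)? groups are optional, this matches every row that contains "shirt", whether or not "cotton" appears before or after it, so you can delete every row that matches it. It does not, by itself, check that the rows you keep contain "cotton".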
import re
import csv

header = ['name', 'val', 'test']
data = [
    ["Cotton Dress for Sale", 1, 'test1'],
    ["dress shirt-V-neck-cotton", 2, 'test2'],
    ["no words relevant", 3, 'test3'],
    ["Cotton Shirt for sale", 4, 'test4'],
    ["Stone Italian Yarn Fringe Yoke Cable Cotton Shirt New Look", 5, 'test5'],
    ["cotton-url click", 6, 'test6'],
]

# with open('test.csv', 'w', encoding='UTF8', newline='') as f:
#     writer = csv.writer(f)
#     # write the header
#     writer.writerow(header)
#     # write multiple rows
#     writer.writerows(data)

regex = re.compile(r'.*(cotton)?.*shirt.*|.*shirt.*(cotton)?.*',
                   re.IGNORECASE)

with open('test.csv', 'r', encoding='UTF8', newline='') as f:
    reader = csv.reader(f)
    headFlag = True
    counterWithout = 0
    counterCheck = 0
    for row in reader:
        if headFlag:
            header = row
            headFlag = False
        else:
            if not any(regex.match(x) for x in row):
                counterCheck += 1
                print(row)
            else:
                counterWithout += 1

print(f'{counterCheck=}, {counterWithout=}')
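The pattern in this answer only detects "shirt", so a row such as "no words relevant" would still be kept even though the question asks for it to fail. Below is a minimal sketch of one way to enforce both conditions in a single pattern: a negative lookahead rejects "shirt" and a positive lookahead requires "cotton". It assumes the text to test sits in the first column, as in the sample data above; the keep_row helper and its text_col parameter are hypothetical names used only for illustration.

```python
import re

# Anchored at the start: fail if "shirt" appears anywhere,
# succeed only if "cotton" appears somewhere (case-insensitive).
# DOTALL lets the lookaheads see past newlines inside a cell.
KEEP = re.compile(r'^(?!.*shirt)(?=.*cotton)', re.IGNORECASE | re.DOTALL)


def keep_row(row, text_col=0):
    """Hypothetical helper: True when the text cell has 'cotton' but not 'shirt'."""
    return KEEP.match(row[text_col]) is not None


# Sample rows taken from the question and answer above.
rows = [
    ["Cotton Dress for Sale", "1", "test1"],
    ["dress shirt-V-neck-cotton", "2", "test2"],
    ["no words relevant", "3", "test3"],
    ["Cotton Shirt for sale", "4", "test4"],
    ["Stone Italian Yarn Fringe Yoke Cable Cotton Shirt New Look", "5", "test5"],
    ["cotton-url click", "6", "test6"],
]

kept = [row for row in rows if keep_row(row)]
for row in kept:
    print(row)
```

Testing one known column, instead of any(...) over every cell, avoids keeping a row just because some unrelated cell happens to contain "cotton" without "shirt". If the text can appear in several columns, one option is to join those cells into a single string before matching.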
Community Discussions, Code Snippets contain sources that include Stack Exchange Network
No vulnerabilities reported