OpenRefine | open source power tool for working with messy data
kandi X-RAY | OpenRefine Summary
OpenRefine is a Java-based power tool that allows you to load data, understand it, clean it up, reconcile it, and augment it with data coming from the web. All from a web browser and the comfort and privacy of your own computer.
Top functions reviewed by kandi - BETA
- Parse a numeric token.
- Retrieves data from a post request.
- Returns the next token.
- Encode the main loop.
- Parse a factor.
- Gets the insert SQL.
- Gets the create SQL.
- Export rows.
- Retrieves the data directory.
- Generate a serializable log event.
OpenRefine Key Features
OpenRefine Examples and Code Snippets
main :: IO ()
main = do
let x1 = allSubseqs2 [6,3,1,5,2,7,8,1]
print $ filter' ((==) (maximum (map' length x1)) . length) x1
longSubseqs values = do
let x1 = allSubseqs2 values
filter' ((==) (maximum (map'
public class Test
{
public static void main(String[] args)
{
System.out.println(countFileRecords());
}
package com;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.util.Scann
4.0.0
com.example
mongodb-javafx-demo
1.0-SNAPSHOT
mongo
UTF-8
18
org.openjfx
javafx-controls
${javafx.version}
---
title: "minimal reproducible example"
author: "User"
date: "April 2022"
output:
beamer_presentation:
keep_tex: true
bibliography: test.bib
header-includes:
- \AtBeginEnvironment{CSLReferences}{\tiny}
---
## Main question
H
String url = conn.getMetaData().getURL();
if (url.equals("jdbc:columnlist:connection")) {
SimpleResultSet rs = new SimpleResultSet();
// With some connection options "id" should be used instead
rs.addColumn("ID", Types.BIGINT,
class TrieNode {
constructor(data=null) {
this.children = {}; // Dictionary,
this.data = data; // Non-null when this node represents the end of a valid word
}
addWord(word, data) {
let node = this; // t
import java.util.Scanner;
import java.util.ArrayList;
public class MyClass {
public static void main(String args[]) {
Scanner sc = new Scanner(System.in); //Telling the scanner class to accept input from keyboard
#nowarn "9"
open System
open System.Runtime.InteropServices
open BenchmarkDotNet.Attributes
open BenchmarkDotNet.Running
open Microsoft.FSharp.NativeInterop
type ShortEventDataRec =
{
Timestamp: DateTime
Event: by
class HomeScreen00 extends StatefulWidget {
@override
_HomeScreen00State createState() => _HomeScreen00State();
}
class _HomeScreen00State extends State {
List myIds = [];
List myServiceNames = [];
List myImagesUrl = [];
bo
export 'string_extension.dart';
export 'list_extension.dart';
abstract class Extensions {}
Community Discussions
Trending Discussions on OpenRefine
QUESTION
I am using a triple store, Apache Jena Fuseki, to store RDF, but my input data is in CSV format. I researched a lot and did not find a direct way to convert CSV to RDF. There is tarql, a command-line tool that can do the job, but I need a Python script that converts my CSV to RDF directly.
I have used tools like OpenRefine and tarql, but I need a Python script for this job. I read somewhere that owlready2 can also be used to convert CSV to RDF, but when I visited the official site I found that it works from an OWL file.
Thanks!
...ANSWER
Answered 2022-Feb-05 at 16:46
CSVW (CSV on the Web) is a W3C Recommendation for this. There is a Python implementation.
Or you can run "tarql" from Python by forking a subprocess.
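The subprocess route could look roughly like the sketch below. It assumes the `tarql` executable is on the PATH and is invoked as `tarql <mapping.sparql> <input.csv>`, with the resulting RDF written to standard output. The function names and the file names in the usage comment are hypothetical, chosen only for illustration.

```python
import subprocess


def build_tarql_command(mapping_path, csv_path, executable="tarql"):
    """Return the argument list for: tarql <mapping.sparql> <input.csv>."""
    return [executable, mapping_path, csv_path]


def csv_to_rdf(mapping_path, csv_path, output_path, executable="tarql"):
    """Run tarql and save whatever it prints on stdout to output_path."""
    command = build_tarql_command(mapping_path, csv_path, executable)
    with open(output_path, "wb") as output_file:
        # check=True raises CalledProcessError if tarql exits with an error.
        subprocess.run(command, stdout=output_file, check=True)


# Usage (hypothetical file names):
# csv_to_rdf("mapping.sparql", "people.csv", "people.ttl")
```

Passing the command as a list rather than a single shell string avoids quoting problems when file names contain spaces.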
QUESTION
I'd like to write my own OpenRefine extension
Before starting any implementation, I just want to build the sample extension from OpenRefine just to get me started.
However, I'm getting the Maven error
...ANSWER
Answered 2021-Dec-30 at 15:25
OK, I think the sample project has the wrong version in its pom.xml. It should be
QUESTION
I have my own Python library that I would like to use in OpenRefine as described here
However, it seems that all the Python code in OpenRefine goes through Jython which supports only Python 2
Is there a way to run Python3 code in OpenRefine?
cheers
...ANSWER
Answered 2021-Dec-29 at 06:36
Short answer: NO. OpenRefine uses Jython, which is currently based on Python 2.7, and there are no immediate or short-term plans to move to 3.x versions.
BUT.
There is a trick to do this, as long as you have Python 3 installed on your machine. Python 2 allows you to execute a command-line script or tool and collect the result.
This simple Python 2 script will do that:
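The answerer's script is not reproduced on this page. As a minimal sketch of the idea, assuming a `python3` executable on the PATH, a helper like the one below launches the external interpreter and returns what it prints. The name `run_python3` and the script path in the usage comment are hypothetical. The code sticks to `subprocess.Popen`, which is available in Python 2.7 as well as Python 3.

```python
import subprocess


def run_python3(arguments, interpreter="python3"):
    """Run the external interpreter with the given arguments and return its stdout."""
    process = subprocess.Popen(
        [interpreter] + list(arguments),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    out, err = process.communicate()
    if process.returncode != 0:
        # Surface the external script's error message instead of failing silently.
        raise RuntimeError(err.decode("utf-8"))
    return out.decode("utf-8").strip()


# Inside an OpenRefine Jython expression, the last line would be something like
# (hypothetical script path):
# return run_python3(["/path/to/my_script.py", value])
```

Starting a new process for every cell is slow, so this is probably only practical for small projects.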
QUESTION
I have a column of values with a range of dates formatted as DD Month YYYY, but I want this to read Month DD YYYY. So, for example, "14 October 2021" should be "October 14 2021". Is there a simple way to do this in OpenRefine?
Thank you!
...ANSWER
Answered 2021-Oct-14 at 21:36
From a Google search, it looks like there is a Python library called Jython. If you install it, you could try.
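What to try is not shown in the answer. One possibility, sketched here in plain Python that also runs under Jython, is to split the value on spaces and swap the first two parts. The function name `month_first` is hypothetical. In an OpenRefine Jython expression the same logic would take `value` as input and end with a `return`.

```python
def month_first(value):
    """Turn 'DD Month YYYY' into 'Month DD YYYY'; leave other shapes untouched."""
    parts = value.split()
    if len(parts) != 3:
        # Not three space-separated parts, so do not guess at the format.
        return value
    day, month, year = parts
    return " ".join([month, day, year])
```

Values that do not consist of exactly three parts are returned unchanged, so unexpected cells are easy to spot afterwards.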
QUESTION
I'm trying to add a column based on a column in OpenRefine using GREL.
I need to extract all the text after the second space in a scientific name.
Here are two examples of the original cell data ---> what I want to extract:
...Amandinea punctata (Hoffm.) Coppins & Scheid. ---> (Hoffm.) Coppins & Scheid.
Agonimia tristicula (Nyl.) Zahlbr. ---> (Nyl.) Zahlbr.
ANSWER
Answered 2021-Aug-31 at 14:58
A solution: partition on what appears to be a good separator, " (", take the right part, and add the missing "(" at the beginning.
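The partition idea can be sketched in Python as follows. This illustrates the logic and is not the answerer's GREL expression. The function names are hypothetical, and the second function shows the literal "after the second space" reading of the question for comparison.

```python
def author_part(name):
    """Return the text from the first ' (' onward, or '' if there is none."""
    left, separator, right = name.partition(" (")
    # partition drops the separator, so the opening parenthesis is added back.
    return "(" + right if separator else ""


def after_second_space(name):
    """Return everything after the second space, or '' if there are fewer than two."""
    pieces = name.split(" ", 2)
    return pieces[2] if len(pieces) == 3 else ""
```

The two approaches agree on the examples in the question. They differ for names whose authorship does not start with a parenthesis, where only the second-space version returns anything.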
QUESTION
I have a tei listPerson
...ANSWER
Answered 2021-Jul-01 at 13:35
With XSLT 2 or 3, I usually prefer to use xsl:value-of with a separator to construct the lines of CSV, e.g.
QUESTION
Let's suppose I have this list in OpenRefine:
- A
- B
- C
Is there a way to move (offset values) B to A like the following?
- A B
- B C
ANSWER
Answered 2021-May-31 at 08:08
With the cross() function and v3.5 of OpenRefine (currently in beta), you can access previous or following rows by not supplying the field name. You can achieve the same by creating an index column in v3.4.
So, you can use cells.ColumnName.value +" "+ cross(row.index + 1, "", "")[0].cells.ColumnName.value to get the value of the current row followed by a space and the value of the same cell in the next row.
Note that this will take the value of the row with the next higher index, not necessarily the row that follows in the display, if you use sorting.
Regards, Antoine
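For readers who want to see the intended result outside OpenRefine, the offset can be sketched in Python. The function name `join_with_next` is hypothetical. The question does not say what should happen to the last row, so leaving it unchanged is an assumption made here.

```python
def join_with_next(values, separator=" "):
    """Append to each value the value of the row that follows it."""
    joined = []
    for index, current in enumerate(values):
        if index + 1 < len(values):
            joined.append(current + separator + values[index + 1])
        else:
            # Assumption: the last row has no following row and is kept as-is.
            joined.append(current)
    return joined
```

Like the GREL expression, this works on the stored row order, not on any sorted view of the data.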
QUESTION
I have a bunch of product data to clean prior to entry into a database that looks like this:
COL A      COL B        COL C... "N"
Option 1   A, B, C, D   Option 1 attribute
Option 2   C, D, F      Option 2 attribute
Option 3   D, J, Z      Option 3 attribute
And I'd like for it to look like this with a unique row for every unique product option:
COL A      COL B   COL C... "N"
Option 1   A       Option 1 attribute
Option 1   B       Option 1 attribute
Option 1   C       Option 1 attribute
Option 1   D       Option 1 attribute
Option 2   C       Option 2 attribute
Option 2   D       Option 2 attribute
Option 2   F       Option 2 attribute
Option 3   D       Option 3 attribute
Option 3   J       Option 3 attribute
Option 3   Z       Option 3 attribute
I understand how I could do this with a Python script, but I am already using OpenRefine, and I am hoping not to add a whole new process to my data flow.
Is there an easy way to do this in OpenRefine? I am having a hard time finding a method or extensions for something like this.
Thanks!
EDIT
@magdmartin How can you fill down blank cells using delineated values from the first cell?
COL A      COL B     COL C... "N"
Option 1   A,B,C,D   Option 1 attribute
Option 1             Option 1 attribute
Option 1             Option 1 attribute
Option 1             Option 1 attribute
Option 2   C,D,F     Option 2 attribute
Option 2             Option 2 attribute
Option 2             Option 2 attribute
Option 3   D,J,Z     Option 3 attribute
Option 3             Option 3 attribute
Option 3             Option 3 attribute
Turned into
COL A      COL B   COL C... "N"
Option 1   A       Option 1 attribute
Option 1   B       Option 1 attribute
Option 1   C       Option 1 attribute
Option 1   D       Option 1 attribute
Option 2   C       Option 2 attribute
Option 2   D       Option 2 attribute
Option 2   F       Option 2 attribute
Option 3   D       Option 3 attribute
Option 3   J       Option 3 attribute
Option 3   Z       Option 3 attribute
Thanks!
...ANSWER
Answered 2021-May-26 at 02:31
I recorded a video walking through each of the options described below: https://youtu.be/3194zXoJtqI
For this project, you will need to use two OpenRefine functions:
- Split multi-valued cells on COL B to create one new line for each value separated by a comma.
- Fill down and blank down, from the menu, to fill down the values in the other columns.
If you have a lot of columns, you can use All > Transform to speed up the process with the following expression: row.record.cells[columnName].value[0]. The trick here is to fill down Col A last so we can keep the record mode when filling down the other columns.
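The combined effect of the two steps, splitting the multi-valued column and then filling the other columns down, can be sketched in Python on a list of dictionaries. This only illustrates the resulting data shape and is not how OpenRefine implements it. The function name and the column keys are hypothetical.

```python
def split_multi_valued(rows, column, separator=","):
    """Return one row per separated value in `column`, copying the other columns."""
    expanded = []
    for row in rows:
        for piece in row[column].split(separator):
            new_row = dict(row)  # copying the row plays the role of "fill down"
            new_row[column] = piece.strip()
            expanded.append(new_row)
    return expanded


rows = [
    {"COL A": "Option 1", "COL B": "A, B, C, D", "COL C": "Option 1 attribute"},
    {"COL A": "Option 2", "COL B": "C, D, F", "COL C": "Option 2 attribute"},
]
expanded_rows = split_multi_valued(rows, "COL B")
```

Each input row yields as many output rows as it has comma-separated values, with the untouched columns repeated on every one.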
QUESTION
I have a CSV of names like so: Smith, SMITH, John, JOHN and I'm trying to use regex in OpenRefine to remove the names in all caps.
replace(value, /^[A-Z]$/, '') does nothing, and replace(value, /[A-Z]/, '') gets rid of all names with any capital letters and leaves a trail of stray commas.
I need to delete the all caps names and any commas that may follow as well. I'm not interested in preserving the list by making all names lower case or capitalizing the first letter of each name. Any name in all caps must be deleted.
...ANSWER
Answered 2021-Mar-23 at 22:09
Use
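The expression the answerer suggested is not reproduced on this page. As a hedged illustration of one pattern that meets the stated requirement, the Python sketch below matches whole words made only of capital letters, together with an optional following comma and whitespace. Requiring at least two letters, so that single initials survive, is an assumption made here, as is the function name.

```python
import re

# A whole word of two or more capitals, plus an optional comma and trailing spaces.
ALL_CAPS_NAME = re.compile(r"\b[A-Z]{2,}\b,?\s*")


def drop_all_caps_names(value):
    """Remove all-caps names and tidy any comma or space left at either end."""
    cleaned = ALL_CAPS_NAME.sub("", value)
    return cleaned.strip(", ")
```

The first attempt in the question fails because `^[A-Z]$` only matches a cell consisting of exactly one capital letter. The second fails because `[A-Z]` matches every capital letter individually, including the initial of "Smith".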
QUESTION
I love OpenRefine and how easy it is to use. I have just been looking into the Extract / Apply feature, and it would be really useful for what I use OpenRefine for. I was hoping that it would be able to use wildcards to match a pattern in the Apply section.
So in the example below, I have a new column called Cluster and in there there are items which will be
...ANSWER
Answered 2021-Feb-25 at 16:23
First of all, I need to warn you that the Extract Operations / Apply Operations facility is not fully developed and has a number of limitations if used on anything other than the original data.
Anything that ends up being recorded as a mass-edit is unlikely to be useful for replaying on different data. For this use case, I'd suggest using something like the replace function with a regex pattern as the string to be replaced, so something like:
Community Discussions, Code Snippets contain sources that include Stack Exchange Network
Vulnerabilities
No vulnerabilities reported
Install OpenRefine