Popular New Releases in Wiki
outline: v0.63.0
BookStack: v22.03.1
penrose: v1.3.0
kb: v0.1.6
wikiextractor: v3.0.4
Popular Libraries in Wiki
by houshanren css
25626
Home-buying knowledge distilled from a 2017 house purchase in Hangzhou, shared in the hope that it helps everyone. Buying a home is not easy, so cherish it if you do.
by outline typescript
15149 NOASSERTION
The fastest wiki and knowledge base for growing teams. Beautiful, feature rich, and markdown compatible.
by gollum ruby
12377 MIT
A simple, Git-powered wiki with a sweet API and local frontend.
by BookStackApp php
9107 NOASSERTION
A platform to create documentation/wiki content built with PHP & Laravel
by HannahMitt java
7907 Apache-2.0
Android application powering the mirror in my house
by Jermolene javascript
6684 NOASSERTION
A self-contained JavaScript wiki for the browser, Node.js, AWS Lambda etc.
by penrose typescript
5350 MIT
Create beautiful diagrams just by typing mathematical notation in plain text.
by splitbrain php
3359 GPL-2.0
The DokuWiki Open Source Wiki Engine
by nikitavoloboev javascript
3301
Everything I know
Trending New libraries in Wiki
by gnebbia python
2726 GPL-3.0
A minimalist command line knowledge base manager
by CollaboraOnline javascript
646 NOASSERTION
Collabora Online is a collaborative online office suite based on LibreOffice technology. This is also the source for the Collabora Office apps for iOS and Android.
by notion-enhancer shell
471 MIT
notion executables with the notion-enhancer embedded & a vanilla port of the official app to linux
by GauravSingh9356 python
387 MIT
Personal Assistant built using python libraries. It does almost anything which includes sending emails, Optical Text Recognition, Dynamic News Reporting at any time with API integration, Todo list generator, Opens any website with just a voice command, Plays Music, Wikipedia searching, Dictionary with Intelligent Sensing i.e. auto spell checking, Weather Reporting i.e. temp, wind speed, humidity, YouTube searching, Google Map searching, Youtube Downloading, etc.
by joekroese html
387 BSD-2-Clause
Your open source external brain
by chrisvel javascript
346 AGPL-3.0
Wreeto is an open source note-taking, knowledge management and wiki system.
by zverok python
331 MIT
Query language for efficient data extraction from Wikipedia
by mickael-menu go
314 GPL-3.0
A plain text note-taking assistant
by daveshap python
183 MIT
Convert Wikipedia database dumps into plaintext files
Top Authors in Wiki
1. 297 Libraries, 9397
2. 13 Libraries, 41
3. 11 Libraries, 88
4. 10 Libraries, 494
5. 7 Libraries, 26
6. 7 Libraries, 57
7. 7 Libraries, 1230
8. 7 Libraries, 131
9. 6 Libraries, 73
10. 6 Libraries, 190
Trending Kits in Wiki
No Trending Kits are available at this moment for Wiki
Trending Discussions on Wiki
CentOS through a VM - no URLs in mirrorlist
Repeatedly removing the maximum average subarray
Under what notion of equality are typeclass laws written?
Is there an identity index value in JavaScript?
Error [ERR_REQUIRE_ESM]: require() of ES Module not supported
Log4j vulnerability - Is Log4j 1.2.17 vulnerable (was unable to find any JNDI code in source)?
Why is QuackSort 2x faster than Data.List's sort for random lists?
Resource linking fails on lStar
Bubble sort slower with -O3 than -O2 with GCC
Efficient summation in Python
QUESTION
CentOS through a VM - no URLs in mirrorlist
Asked 2022-Mar-26 at 21:04

I am trying to run a CentOS 8 server through VirtualBox (6.1.30) (Vagrant), which worked fine for me yesterday, but today when I run sudo yum update I keep getting this error:
[vagrant@192.168.38.4] ~ >> sudo yum update
CentOS Linux 8 - AppStream                      71  B/s |  38  B     00:00
Error: Failed to download metadata for repo 'appstream': Cannot prepare internal mirrorlist: No URLs in mirrorlist
I already tried changing the nameservers in /etc/resolv.conf, removing the DNF cache folders, and so on. On other computers this works just fine, so I think the problem is with my host machine. I also tried resetting the network settings (I am on a Windows 10 host), without success either. It's not a DNS problem; name resolution works fine.
After I reinstalled Windows, I still have the same error in my VM.
File dnf.log:
2022-01-31T15:28:03+0000 INFO --- logging initialized ---
2022-01-31T15:28:03+0000 DDEBUG timer: config: 2 ms
2022-01-31T15:28:03+0000 DEBUG Loaded plugins: builddep, changelog, config-manager, copr, debug, debuginfo-install, download, generate_completion_cache, groups-manager, needs-restarting, playground, repoclosure, repodiff, repograph, repomanage, reposync
2022-01-31T15:28:03+0000 DEBUG YUM version: 4.4.2
2022-01-31T15:28:03+0000 DDEBUG Command: yum update
2022-01-31T15:28:03+0000 DDEBUG Installroot: /
2022-01-31T15:28:03+0000 DDEBUG Releasever: 8
2022-01-31T15:28:03+0000 DEBUG cachedir: /var/cache/dnf
2022-01-31T15:28:03+0000 DDEBUG Base command: update
2022-01-31T15:28:03+0000 DDEBUG Extra commands: ['update']
2022-01-31T15:28:03+0000 DEBUG User-Agent: constructed: 'libdnf (CentOS Linux 8; generic; Linux.x86_64)'
2022-01-31T15:28:05+0000 DDEBUG Cleaning up.
2022-01-31T15:28:05+0000 SUBDEBUG
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/dnf/repo.py", line 574, in load
    ret = self._repo.load()
  File "/usr/lib64/python3.6/site-packages/libdnf/repo.py", line 397, in load
    return _repo.Repo_load(self)
libdnf._error.Error: Failed to download metadata for repo 'appstream': Cannot prepare internal mirrorlist: No URLs in mirrorlist

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/dnf/cli/main.py", line 67, in main
    return _main(base, args, cli_class, option_parser_class)
  File "/usr/lib/python3.6/site-packages/dnf/cli/main.py", line 106, in _main
    return cli_run(cli, base)
  File "/usr/lib/python3.6/site-packages/dnf/cli/main.py", line 122, in cli_run
    cli.run()
  File "/usr/lib/python3.6/site-packages/dnf/cli/cli.py", line 1050, in run
    self._process_demands()
  File "/usr/lib/python3.6/site-packages/dnf/cli/cli.py", line 740, in _process_demands
    load_available_repos=self.demands.available_repos)
  File "/usr/lib/python3.6/site-packages/dnf/base.py", line 394, in fill_sack
    self._add_repo_to_sack(r)
  File "/usr/lib/python3.6/site-packages/dnf/base.py", line 137, in _add_repo_to_sack
    repo.load()
  File "/usr/lib/python3.6/site-packages/dnf/repo.py", line 581, in load
    raise dnf.exceptions.RepoError(str(e))
dnf.exceptions.RepoError: Failed to download metadata for repo 'appstream': Cannot prepare internal mirrorlist: No URLs in mirrorlist
2022-01-31T15:28:05+0000 CRITICAL Error: Failed to download metadata for repo 'appstream': Cannot prepare internal mirrorlist: No URLs in mirrorlist
ANSWER
Answered 2022-Mar-26 at 20:59

Check out this article: CentOS Linux EOL.
The commands below helped me:
sed -i 's/mirrorlist/#mirrorlist/g' /etc/yum.repos.d/CentOS-Linux-*
sed -i 's|#baseurl=http://mirror.centos.org|baseurl=http://vault.centos.org|g' /etc/yum.repos.d/CentOS-Linux-*
Doing this will make DNF work, but you will no longer receive any updates.
To upgrade to CentOS 8 stream:
sudo dnf install centos-release-stream -y
sudo dnf swap centos-{linux,stream}-repos -y
sudo dnf distro-sync -y
Optionally reboot if your kernel updated (not needed in containers).
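Before touching /etc/yum.repos.d on a real system, the two sed substitutions can be tried on a scratch copy of a repo file. A minimal sketch; the repo file contents below are illustrative, not copied from a real machine:

```shell
# Sketch: apply the vault.centos.org fix to a throwaway copy of a repo file.
# The repo body is made up here, but has the same mirrorlist/#baseurl shape
# as a stock CentOS Linux 8 repo file.
repo=$(mktemp)
cat > "$repo" <<'EOF'
[appstream]
name=CentOS Linux 8 - AppStream
mirrorlist=http://mirrorlist.centos.org/?release=8&repo=AppStream
#baseurl=http://mirror.centos.org/$contentdir/$releasever/AppStream/$basearch/os/
EOF

# The same substitutions as in the answer, pointed at the scratch copy:
# comment out the dead mirrorlist, then switch baseurl to the vault archive.
sed -i 's/mirrorlist/#mirrorlist/g' "$repo"
sed -i 's|#baseurl=http://mirror.centos.org|baseurl=http://vault.centos.org|g' "$repo"

cat "$repo"
```

Once the output looks right, the same commands can be run against the real files in /etc/yum.repos.d/.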
QUESTION
Repeatedly removing the maximum average subarray
Asked 2022-Feb-28 at 18:19

I have an array of positive integers. For example:
[1, 7, 8, 4, 2, 1, 4]
A "reduction operation" finds the array prefix with the highest average, and deletes it. Here, an array prefix means a contiguous subarray whose left end is the start of the array, such as [1] or [1, 7] or [1, 7, 8] above. Ties are broken by taking the longer prefix.
Original array:  [  1,   7,   8,   4,   2,   1,   4]

Prefix averages: [1.0, 4.0, 5.3, 5.0, 4.4, 3.8, 3.9]

-> Delete [1, 7, 8], with maximum average 5.3
-> New array -> [4, 2, 1, 4]
I will repeat the reduction operation until the array is empty:
[1, 7, 8, 4, 2, 1, 4]
 ^     ^
[4, 2, 1, 4]
 ^
[2, 1, 4]
 ^     ^
[]
Now, actually performing these array modifications isn't necessary; I'm only looking for the list of lengths of prefixes that would be deleted by this process, for example [3, 1, 3] above.
What is an efficient algorithm for computing these prefix lengths?
The naive approach is to recompute all sums and averages from scratch in every iteration, for an O(n^2) algorithm -- I've attached Python code for this below. I'm looking for any improvement on this approach -- most preferably, any solution below O(n^2), but an algorithm with the same complexity but better constant factors would also be helpful.
Here are a few of the things I've tried (without success):
- Dynamically maintaining prefix sums, for example with a Binary Indexed Tree. While I can easily update prefix sums or find a maximum prefix sum in O(log n) time, I haven't found any data structure which can update the average, as the denominator in the average is changing.
- Reusing the previous 'rankings' of prefix averages -- these rankings can change, e.g. in some array, the prefix ending at index 5 may have a larger average than the prefix ending at index 6, but after removing the first 3 elements, the prefix ending at index 2 may now have a smaller average than the one ending at index 3.
- Looking for patterns in where prefixes end; for example, the rightmost element of any max average prefix is always a local maximum in the array, but it's not clear how much this helps.
This is a working Python implementation of the naive, quadratic method:
from fractions import Fraction
from typing import List, Tuple
import math

def find_array_reductions(nums: List[int]) -> List[int]:
    """Return list of lengths of max average prefix reductions."""

    def max_prefix_avg(arr: List[int]) -> Tuple[float, int]:
        """Return value and length of max average prefix in arr."""
        if len(arr) == 0:
            return (-math.inf, 0)

        best_length = 1
        best_average = Fraction(0, 1)
        running_sum = 0

        for i, x in enumerate(arr, 1):
            running_sum += x
            new_average = Fraction(running_sum, i)
            if new_average >= best_average:
                best_average = new_average
                best_length = i

        return (float(best_average), best_length)

    removed_lengths = []
    total_removed = 0

    while total_removed < len(nums):
        _, new_removal = max_prefix_avg(nums[total_removed:])
        removed_lengths.append(new_removal)
        total_removed += new_removal

    return removed_lengths
Edit: The originally published code had a rare error with large inputs, from using Python's math.isclose() with default parameters for floating point comparison rather than proper fraction comparison. This has been fixed in the current code. An example of the error can be found at this Try it online link, along with a foreword explaining exactly what causes this bug, if you're curious.
ANSWER
Answered 2022-Feb-27 at 22:44

This problem has a fun O(n) solution.
If you draw a graph of cumulative sum vs index, then:
- The average value in the subarray between any two indexes is the slope of the line between those points on the graph.
- The first highest-average prefix will end at the point that makes the highest angle from 0. The next highest-average prefix must then have a smaller average, and it will end at the point that makes the highest angle from the first ending. Continuing to the end of the array, we find that...
- These segments of highest average are exactly the segments in the upper convex hull of the cumulative sum graph.
Find these segments using the monotone chain algorithm. Since the points are already sorted, it takes O(n) time.
# Lengths of the segments in the upper convex hull
# of the cumulative sum graph
def upperSumHullLengths(arr):
    if len(arr) < 2:
        if len(arr) < 1:
            return []
        else:
            return [1]

    hull = [(0, 0), (1, arr[0])]
    for x in range(2, len(arr)+1):
        # this has x coordinate x-1
        prevPoint = hull[len(hull) - 1]
        # next point in cumulative sum
        point = (x, prevPoint[1] + arr[x-1])
        # remove points not on the convex hull
        while len(hull) >= 2:
            p0 = hull[len(hull)-2]
            dx0 = prevPoint[0] - p0[0]
            dy0 = prevPoint[1] - p0[1]
            dx1 = x - prevPoint[0]
            dy1 = point[1] - prevPoint[1]
            if dy1*dx0 < dy0*dx1:
                break
            hull.pop()
            prevPoint = p0
        hull.append(point)

    return [hull[i+1][0] - hull[i][0] for i in range(0, len(hull)-1)]


print(upperSumHullLengths([1, 7, 8, 4, 2, 1, 4]))
prints:
[3, 1, 3]
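As a sanity check, the hull-based approach can be compared against the naive quadratic method on random inputs. The harness below is a hypothetical addition (function names are mine); both functions re-implement the logic shown above:

```python
import random
from fractions import Fraction

def naive_reduction_lengths(nums):
    """O(n^2) baseline: repeatedly take the max-average prefix (ties -> longer)."""
    lengths, start = [], 0
    while start < len(nums):
        best_len, best_avg, running = 1, Fraction(0, 1), 0
        for i, x in enumerate(nums[start:], 1):
            running += x
            avg = Fraction(running, i)
            if avg >= best_avg:  # >= prefers the longer prefix on ties
                best_avg, best_len = avg, i
        lengths.append(best_len)
        start += best_len
    return lengths

def hull_reduction_lengths(arr):
    """O(n): segment lengths of the upper convex hull of cumulative sums."""
    if len(arr) < 1:
        return []
    hull = [(0, 0)]
    for x in range(1, len(arr) + 1):
        point = (x, hull[-1][1] + arr[x - 1])  # next cumulative-sum point
        # Pop while the hull does not turn strictly downward in slope
        while len(hull) >= 2:
            (x0, y0), (x1, y1) = hull[-2], hull[-1]
            if (point[1] - y1) * (x1 - x0) < (y1 - y0) * (point[0] - x1):
                break
            hull.pop()
        hull.append(point)
    return [hull[i + 1][0] - hull[i][0] for i in range(len(hull) - 1)]

if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(200):
        nums = [rng.randint(1, 20) for _ in range(rng.randint(1, 60))]
        assert naive_reduction_lengths(nums) == hull_reduction_lengths(nums)
    print(hull_reduction_lengths([1, 7, 8, 4, 2, 1, 4]))  # [3, 1, 3]
```

The agreement on ties relies on the strict inequality in the pop condition: collinear hull points are popped, which merges equal-average segments into one longer segment, matching the "ties go to the longer prefix" rule.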
QUESTION
Under what notion of equality are typeclass laws written?
Asked 2022-Feb-26 at 19:39

Haskell typeclasses often come with laws; for instance, instances of Monoid are expected to observe that x <> mempty = mempty <> x = x.
Typeclass laws are often written with single-equals (=) rather than double-equals (==). This suggests that the notion of equality used in typeclass laws is something other than that of Eq (which makes sense, since Eq is not a superclass of Monoid).
Searching around, I was unable to find any authoritative statement on the meaning of = in typeclass laws. For instance:
- The Haskell 2010 report does not even contain the word "law" in it.
- Speaking with other Haskell users, most people seem to believe that = usually means extensional equality or substitution but is fundamentally context-dependent. Nobody provided any authoritative source for this claim.
- The Haskell wiki article on monad laws states that = is extensional, but, again, fails to provide a source, and I wasn't able to track down any way to contact the author of the relevant edit.
The question, then: Is there any authoritative source on or standard for the semantics of = in typeclass laws? If so, what is it? Additionally, are there examples where the intended meaning of = is particularly exotic?
(As a side note, treating = extensionally can get tricky. For instance, there is a Monoid (IO a) instance, but it's not really clear what extensional equality of IO values looks like.)
ANSWER
Answered 2022-Feb-24 at 22:30

Typeclass laws are not part of the Haskell language, so they are not subject to the same kind of language-theoretic semantic analysis as the language itself.
Instead, these laws are typically presented as an informal mathematical notation. Most presentations do not need a more detailed mathematical exposition, so they do not provide one.
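One common informal reading (a convention, not an official standard) is to treat such laws as executable Boolean properties, substituting (==) for the written = wherever a decidable Eq instance exists. A minimal dependency-free sketch for the list Monoid:

```haskell
import Data.Semigroup ((<>))

-- Sketch: the Monoid laws for [Int], restated as testable Boolean properties.
-- The informal "=" of the laws is approximated here by (==) from Eq, which is
-- only possible because lists have decidable equality; for types like IO a,
-- no such executable reading exists.
prop_leftIdentity :: [Int] -> Bool
prop_leftIdentity xs = mempty <> xs == xs

prop_rightIdentity :: [Int] -> Bool
prop_rightIdentity xs = xs <> mempty == xs

prop_assoc :: [Int] -> [Int] -> [Int] -> Bool
prop_assoc xs ys zs = (xs <> ys) <> zs == xs <> (ys <> zs)

main :: IO ()
main
  | and [ prop_leftIdentity [1, 2, 3]
        , prop_rightIdentity []
        , prop_assoc [1] [2] [3, 4] ] = putStrLn "laws hold on these samples"
  | otherwise                         = error "law violated"
```

Libraries like QuickCheck automate exactly this reading by generating the property arguments randomly; of course, passing tests only provide evidence for the law, not a proof.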
QUESTION
Is there an identity index value in JavaScript?
Asked 2022-Feb-13 at 18:48

In JavaScript, values of objects and arrays can be indexed like the following: objOrArray[index]. Is there an identity "index" value?
In other words:
Is there a value of x that makes the following always true?

let a = [1, 2, 3, 4];
/* Is this true? */ a[x] == a

let b = { a: 1, b: 2, c: 3 };
/* Is this true? */ b[x] == b

Definition of an identity in this context: https://en.wikipedia.org/wiki/Identity_function
ANSWER
Answered 2021-Oct-05 at 01:31

The indexing operation doesn't have an identity element. The domain and range of indexing are not necessarily the same -- the domain is arrays and objects, but the range is any type, since array elements and object properties can hold any type. If you have an array of integers, the domain is Array, while the range is Integer, so it's not possible for there to be an identity. a[x] will always be an integer, which can never be equal to the array itself.
And even if you have an array of arrays, there's no reason to expect any of the elements to be a reference to the array itself. It's possible to create self-referential arrays like this, but most are not. And even if it is, the self-reference could be in any index, so there's no unique identity value.
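To make that last point concrete, here is a small hypothetical demonstration: a particular array can hold a reference to itself at some index, but no single x works for arrays in general:

```javascript
// A self-referential array: a[3] happens to be a reference to a itself.
const a = [1, 2, 3];
a.push(a);
console.log(a[3] === a); // true, but only for this array and this index

// For an ordinary array, no index behaves as an identity at all.
const b = [1, 2, 3, 4];
const hasIdentityIndex = [...b.keys()].some(x => b[x] === b);
console.log(hasIdentityIndex); // false
```

So even self-reference gives only a per-array, per-index coincidence, never an identity element of the indexing operation.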
QUESTION
Error [ERR_REQUIRE_ESM]: require() of ES Module not supported
Asked 2022-Feb-03 at 22:08

I'm trying to make a Discord bot that just says if someone is online on the game. However, I keep getting this message:
[ERR_REQUIRE_ESM]: require() of ES Module from not supported. Instead change the require of index.js in... to a dynamic import() which is available in all CommonJS modules.
This is my code:
module.exports = {
    name: 'username',
    description: "this is the username command",
    async execute(message, args) {

        const fetch = require('node-fetch');

        if (args.length !== 1) {
            return message.channel.send("invalid username wtf")
        }

        const ign = args[0]

        if (ign.length > 16 || ign.length < 3) {
            return message.channel.send("invalid username wtf")
        }

        const uuid = await fetch(`https://api.mojang.com/users/profiles/minecraft/${ign}`).then(data => data.json()).then(data => data.id).catch(err => message.channel.send("error wtf"));
        const onlineInfo = await fetch(`https://api.hypixel.net/status?key=${john}&uuid=${uuid}`).then(data => data.json());

        if (uuid.length !== 32) {
            return;
        }

        if (onlineInfo.success) {
            if (onlineInfo.session.online) {
                message.channel.send("they are online")
            }
            else {
                message.channel.send("they are offline")
            }
        }
        else {
            message.channel.send("hypixel api bad wtf")
        }
    }
}
This is my package.json file:
{
  "name": "discordbot",
  "version": "1.0.0",
  "main": "main.js",
  "scripts": {
    "test": "echo \"Error: no test specified\" && exit 1",
    "start": "node main.js"
  },
  "author": "",
  "license": "ISC",
  "description": "",
  "dependencies": {
    "discord.js": "^13.0.1",
    "node-fetch": "^3.0.0"
  }
}
ANSWER
Answered 2021-Sep-07 at 06:38

node-fetch v3 recently stopped supporting the require way of importing it, in favor of ES Modules. You'll need to use ESM imports now, like:
import fetch from "node-fetch";
at the top of your file.
QUESTION
Log4j vulnerability - Is Log4j 1.2.17 vulnerable (was unable to find any JNDI code in source)?
Asked 2022-Feb-01 at 15:47

With regard to the Log4j JNDI remote code execution vulnerability that has been identified, CVE-2021-44228 (also see references), I wondered if Log4j v1.2 is also impacted, but the closest I got from source code review is the JMS-Appender.
The question is, while the posts on the Internet indicate that Log4j 1.2 is also vulnerable, I am not able to find the relevant source code for it.
Am I missing something that others have identified?
Log4j 1.2 appears to have a vulnerability in the socket-server class, but my understanding is that it needs to be enabled in the first place for it to be applicable and hence is not a passive threat unlike the JNDI-lookup vulnerability which the one identified appears to be.
Is my understanding correct that Log4j v1.2 is not vulnerable to the JNDI remote code execution bug?
References

This blog post from Cloudflare also makes the same point as AKX: that the vulnerability was introduced in Log4j 2!
Update #1 - A fork of the (now-retired) apache-log4j-1.2.x, with patch fixes for a few vulnerabilities identified in the older library, is now available from the original Log4j author. The site is https://reload4j.qos.ch/. As of 21-Jan-2022, version 1.2.18.2 has been released. Vulnerabilities addressed to date include those pertaining to JMSAppender, SocketServer and Chainsaw. Note that I am simply relaying this information and have not verified the fixes from my end. Please refer to the link for additional details.
ANSWER
Answered 2022-Jan-01 at 18:43

The JNDI feature was added in Log4j 2.0-beta9.
Log4j 1.x thus does not have the vulnerable code.
QUESTION
Why is QuackSort 2x faster than Data.List's sort for random lists?
Asked 2022-Jan-27 at 19:24

I was looking for the canonical implementation of MergeSort in Haskell to port to HOVM, and I found this StackOverflow answer. When porting the algorithm, I realized something looked silly: the algorithm has a "halve" function that does nothing but split a list in two, using half of the length, before recursing and merging. So I thought: why not make better use of this pass, and use a pivot to make each half respectively smaller and bigger than that pivot? That would increase the odds that recursive merge calls are applied to already-sorted lists, which might speed up the algorithm!
I've done this change, resulting in the following code:
import Data.List
import Data.Word

randomList :: Word32 -> Word32 -> [Word32]
randomList seed 0 = []
randomList seed size = seed : randomList (seed * 1664525 + 1013904223) (size - 1)

quacksort :: [Word32] -> [Word32]
quacksort [] = []
quacksort [x] = [x]
quacksort (p : x : xs) = split p (p : x : xs) [] [] where

  -- Splits the list in two halves of elements smaller/bigger than a pivot
  split p [] as bs = merge (quacksort as) (quacksort bs)
  split p (x : xs) as bs = quack p (p < x) x xs as bs

  -- Helper function for `split`
  quack p False x xs as bs = split p xs (x : as) bs
  quack p True x xs as bs = split p xs as (x : bs)

  -- Merges two lists as a sorted one
  merge [] ys = ys
  merge xs [] = xs
  merge (x : xs) (y : ys) = place (x < y) x xs y ys

  -- Helper function for `merge`
  place False x xs y ys = y : merge (x : xs) ys
  place True x xs y ys = x : merge xs (y : ys)

main :: IO ()
main = do
  let l = randomList 0 2000000
  let b = quacksort l
  print $ sum b
I then benchmarked it and, to my surprise, it was indeed 2x faster than Haskell's official Data.List sort. So I wondered why this isn't used in practice and, suddenly, I realized the obvious: mergesort does NOT perform better on already-sorted lists. D'oh. So the whole assumption behind quacksort was flawed. Not only that, it would perform terribly on reverse-sorted lists, since it would fail to produce two halves of similar size (unless we could guess a really good pivot). So quacksort is wack in all cases and should never be used in practice. But, then...
Why the hell does it perform 2x faster than Data.List's sort for random lists?
I can't think of a good reason this should be the case. Making each half smaller/bigger than a pivot doesn't change how many times merge must be called, so it shouldn't have any positive effect. But reverting it back to a conventional mergesort does make it 2x slower, so, for some reason, the ordered split helps.
ANSWER
Answered 2022-Jan-27 at 19:15
Your split splits the list in two ordered halves, so merge consumes its first argument first and then just produces the second half in full. In other words it is equivalent to ++, doing redundant comparisons on the first half which always turn out to be True.
In the true mergesort the merge actually does twice the work on random data because the two parts are not ordered.
The split though spends some work on the partitioning, whereas an online bottom-up mergesort would spend no work there at all. But the built-in sort tries to detect ordered runs in the input, and apparently that extra work is not negligible.
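The redundant-comparison claim is easy to check outside Haskell. Here is a quick Python sketch (a hypothetical helper, not the Haskell code from the question) showing that merging two already-partitioned halves degenerates into concatenation, with one always-True comparison per left-hand element:

```python
def merge(xs, ys):
    # Textbook two-way merge that also counts comparisons.
    out, i, j, comparisons = [], 0, 0, 0
    while i < len(xs) and j < len(ys):
        comparisons += 1
        if xs[i] < ys[j]:
            out.append(xs[i]); i += 1
        else:
            out.append(ys[j]); j += 1
    return out + xs[i:] + ys[j:], comparisons

small = [1, 2, 3, 4]   # everything below the pivot, already sorted
big   = [5, 6, 7, 8]   # everything above the pivot, already sorted
merged, n_cmp = merge(small, big)
assert merged == small + big   # merge degenerates to concatenation
assert n_cmp == len(small)     # one redundant True comparison per left element
```

Every comparison takes the left branch, so the merge does len(small) wasted tests and then appends the right half wholesale, exactly the ++ behavior described above.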
QUESTION
Resource linking fails on lStar
Asked 2022-Jan-21 at 09:25
I'm working on a React Native application. My Android builds began to fail in the CI environment (and locally) without any changes.
Execution failed for task ':app:processDevelopmentDebugResources'.

> A failure occurred while executing com.android.build.gradle.internal.tasks.Workers$ActionFacade
   > Android resource linking failed
.../app/build/intermediates/incremental/mergeDevelopmentDebugResources/merged.dir/values/values.xml:2682: AAPT: error: resource android:attr/lStar not found.
According to Android: Resource linking fails on test execution even when nothing has been changed, this happened because some library got upgraded.
lStar needs compileSdkVersion 31 and my project used compileSdkVersion 28.
How can I track which libraries got updated recently, or which library is causing this?
ANSWER
Answered 2021-Sep-03 at 11:46
Go to your package.json file and delete as many dependencies as you can until the project builds successfully. Then start adding the dependencies back one by one to detect which ones are causing trouble.
Then you can manually patch those dependencies by opening node_modules/[dependency]/android/build.gradle and pinning androidx.core:core-ktx: or androidx.core:core: to a specific version (1.6.0 in my case).
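An alternative to patching each module under node_modules by hand, sketched here as an untested assumption (the 1.6.0 version comes from the answer above; adjust it to whatever your compileSdkVersion supports), is to force a single androidx.core version for all dependencies from your own android/build.gradle:

```groovy
// android/build.gradle -- force one androidx.core version everywhere
// (sketch only; survives node_modules reinstalls, unlike hand-patching)
allprojects {
    configurations.all {
        resolutionStrategy {
            force 'androidx.core:core:1.6.0'
            force 'androidx.core:core-ktx:1.6.0'
        }
    }
}
```

Gradle's resolutionStrategy.force overrides whatever transitive version a library requests, which avoids editing files that get wiped on the next npm install.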
QUESTION
Bubble sort slower with -O3 than -O2 with GCC
Asked 2022-Jan-21 at 02:41I made a bubble sort implementation in C, and was testing its performance when I noticed that the -O3
flag made it run even slower than no flags at all! Meanwhile -O2
was making it run a lot faster as expected.
Without optimisations:
time ./sort 30000

./sort 30000  1.82s user 0.00s system 99% cpu 1.816 total
-O2:
time ./sort 30000

./sort 30000  1.00s user 0.00s system 99% cpu 1.005 total
-O3:
time ./sort 30000

./sort 30000  2.01s user 0.00s system 99% cpu 2.007 total
The code:
#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>
#include <time.h>

int n;

void bubblesort(int *buf)
{
    bool changed = true;
    for (int i = n; changed == true; i--) { /* will always move at least one element to its rightful place at the end, so can shorten the search by 1 each iteration */
        changed = false;

        for (int x = 0; x < i-1; x++) {
            if (buf[x] > buf[x+1]) {
                /* swap */
                int tmp = buf[x+1];
                buf[x+1] = buf[x];
                buf[x] = tmp;

                changed = true;
            }
        }
    }
}

int main(int argc, char *argv[])
{
    if (argc != 2) {
        fprintf(stderr, "Usage: %s <arraysize>\n", argv[0]);
        return EXIT_FAILURE;
    }

    n = atoi(argv[1]);
    if (n < 1) {
        fprintf(stderr, "Invalid array size.\n");
        return EXIT_FAILURE;
    }

    int *buf = malloc(sizeof(int) * n);

    /* init buffer with random values */
    srand(time(NULL));
    for (int i = 0; i < n; i++)
        buf[i] = rand() % n + 1;

    bubblesort(buf);

    return EXIT_SUCCESS;
}
The assembly language generated for -O2 (from godbolt.org):
bubblesort:
        mov     r9d, DWORD PTR n[rip]
        xor     edx, edx
        xor     r10d, r10d
.L2:
        lea     r8d, [r9-1]
        cmp     r8d, edx
        jle     .L13
.L5:
        movsx   rax, edx
        lea     rax, [rdi+rax*4]
.L4:
        mov     esi, DWORD PTR [rax]
        mov     ecx, DWORD PTR [rax+4]
        add     edx, 1
        cmp     esi, ecx
        jle     .L2
        mov     DWORD PTR [rax+4], esi
        mov     r10d, 1
        add     rax, 4
        mov     DWORD PTR [rax-4], ecx
        cmp     r8d, edx
        jg      .L4
        mov     r9d, r8d
        xor     edx, edx
        xor     r10d, r10d
        lea     r8d, [r9-1]
        cmp     r8d, edx
        jg      .L5
.L13:
        test    r10b, r10b
        jne     .L14
.L1:
        ret
.L14:
        lea     eax, [r9-2]
        cmp     r9d, 2
        jle     .L1
        mov     r9d, r8d
        xor     edx, edx
        mov     r8d, eax
        xor     r10d, r10d
        jmp     .L5
And the same for -O3:
bubblesort:
        mov     r9d, DWORD PTR n[rip]
        xor     edx, edx
        xor     r10d, r10d
.L2:
        lea     r8d, [r9-1]
        cmp     r8d, edx
        jle     .L13
.L5:
        movsx   rax, edx
        lea     rcx, [rdi+rax*4]
.L4:
        movq    xmm0, QWORD PTR [rcx]
        add     edx, 1
        pshufd  xmm2, xmm0, 0xe5
        movd    esi, xmm0
        movd    eax, xmm2
        pshufd  xmm1, xmm0, 225
        cmp     esi, eax
        jle     .L2
        movq    QWORD PTR [rcx], xmm1
        mov     r10d, 1
        add     rcx, 4
        cmp     r8d, edx
        jg      .L4
        mov     r9d, r8d
        xor     edx, edx
        xor     r10d, r10d
        lea     r8d, [r9-1]
        cmp     r8d, edx
        jg      .L5
.L13:
        test    r10b, r10b
        jne     .L14
.L1:
        ret
.L14:
        lea     eax, [r9-2]
        cmp     r9d, 2
        jle     .L1
        mov     r9d, r8d
        xor     edx, edx
        mov     r8d, eax
        xor     r10d, r10d
        jmp     .L5
It seems like the only significant difference to me is the apparent attempt to use SIMD, which seems like it should be a large improvement, but I also can't tell what on earth it's attempting with those pshufd instructions... is this just a failed attempt at SIMD? Or maybe the couple of extra instructions is just about edging out my instruction cache?
Timings were done on an AMD Ryzen 5 3600.
ANSWER
Answered 2021-Oct-27 at 19:53
It looks like GCC's naïveté about store-forwarding stalls is hurting its auto-vectorization strategy here. See also Store forwarding by example for some practical benchmarks on Intel with hardware performance counters, and What are the costs of failed store-to-load forwarding on x86? Also Agner Fog's x86 optimization guides.
(gcc -O3 enables -ftree-vectorize and a few other options not included by -O2, e.g. if-conversion to branchless cmov, which is another way -O3 can hurt with data patterns GCC didn't expect. By comparison, Clang enables auto-vectorization even at -O2, although some of its optimizations are still only on at -O3.)
It's doing 64-bit loads (and branching to store or not) on pairs of ints. This means, if we swapped the last iteration, this load comes half from that store, half from fresh memory, so we get a store-forwarding stall after every swap. But bubble sort often has long chains of swapping every iteration as an element bubbles far, so this is really bad.
(Bubble sort is bad in general, especially if implemented naively without keeping the previous iteration's second element around in a register. It can be interesting to analyze the asm details of exactly why it sucks, so it is fair enough for wanting to try.)
Anyway, this is pretty clearly an anti-optimization you should report on GCC Bugzilla with the "missed-optimization" keyword. Scalar loads are cheap, and store-forwarding stalls are costly. (Can modern x86 implementations store-forward from more than one prior store? no, nor can microarchitectures other than in-order Atom efficiently load when it partially overlaps with one previous store, and partially from data that has to come from the L1d cache.)
Even better would be to keep buf[x+1] in a register and use it as buf[x] in the next iteration, avoiding a store and load. (Like good hand-written asm bubble sort examples, a few of which exist on Stack Overflow.)
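A Python model of that register-carrying rewrite may help (a sketch of the data flow only; the actual saving is the avoided store/reload, which only shows up in asm):

```python
def bubblesort_carry(buf):
    # The element being bubbled lives in `cur` (the "register"), so each
    # array slot is loaded and stored at most once per pass, instead of
    # storing buf[x+1] and immediately reloading it on the next iteration.
    for end in range(len(buf) - 1, 0, -1):
        changed = False
        cur = buf[0]
        for x in range(end):
            nxt = buf[x + 1]
            if cur > nxt:
                buf[x] = nxt      # smaller element shifts left...
                changed = True    # ...while cur keeps bubbling right
            else:
                buf[x] = cur
                cur = nxt
        buf[end] = cur            # final resting place of this pass's max
        if not changed:
            break
    return buf

assert bubblesort_carry([5, 1, 4, 2, 8]) == [1, 2, 4, 5, 8]
```

In asm this halves the memory traffic of the inner loop and, more importantly for this question, avoids reloading a value that was just stored.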
If it wasn't for the store-forwarding stalls (which AFAIK GCC doesn't know about in its cost model), this strategy might be about break-even. SSE 4.1 for a branchless pmind / pmaxd comparator might be interesting, but that would mean always storing, and the C source doesn't do that.
If this strategy of double-width load had any merit, it would be better implemented with pure integer on a 64-bit machine like x86-64, where you can operate on just the low 32 bits with garbage (or valuable data) in the upper half. E.g.,
## What GCC should have done,
## if it was going to use this 64-bit load strategy at all

    movsx   rax, edx           # apparently it wasn't able to optimize away your half-width signed loop counter into pointer math
    lea     rcx, [rdi+rax*4]   # Usually not worth an extra instruction just to avoid an indexed load and indexed store, but let's keep it for easy comparison.
.L4:
    mov     rax, [rcx]         # into RAX instead of XMM0
    add     edx, 1
    # pshufd xmm2, xmm0, 0xe5
    # movd   esi, xmm0
    # movd   eax, xmm2
    # pshufd xmm1, xmm0, 225
    mov     rsi, rax
    rol     rax, 32            # swap halves, just like the pshufd
    cmp     esi, eax           # or eax, esi? I didn't check which is which
    jle     .L2
    movq    QWORD PTR [rcx], rax   # conditionally store the swapped qword
(Or with BMI2 available from -march=native, rorx rsi, rax, 32 can copy-and-swap in one uop. Without BMI2, a mov and swapping the original instead of the copy saves latency if running on a CPU without mov-elimination, such as Ice Lake with updated microcode.)
So total latency from load to compare is just integer load + one ALU operation (rotate), vs. XMM load -> movd. And it's fewer ALU uops.
This does nothing to help with the store-forwarding stall problem, though, which is still a showstopper. This is just an integer SWAR implementation of the same strategy, replacing 2x pshufd and 2x movd r32, xmm with just mov + rol.
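To make the bit movement of the rol rax, 32 half-swap concrete, here is a tiny Python model (a toy illustration, not actual codegen):

```python
MASK64 = (1 << 64) - 1

def rol64(x, k):
    # 64-bit rotate-left, mirroring `rol rax, 32` on the packed qword
    return ((x << k) | (x >> (64 - k))) & MASK64

# A little-endian qword load of buf[x], buf[x+1] puts buf[x] in the low half.
bufx, bufx1 = 7, 3
packed = (bufx1 << 32) | bufx
swapped = rol64(packed, 32)             # halves exchanged, like the pshufd
assert swapped & 0xFFFFFFFF == bufx1    # buf[x+1] is now in the low half
assert swapped >> 32 == bufx            # buf[x] is now in the high half
```

One rotate performs the whole two-element swap, which is why the integer version needs fewer ALU uops than the pshufd/movd dance.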
Actually, there's no reason to use 2x pshufd here. Even if using XMM registers, GCC could have done one shuffle that swapped the low two elements, setting up for both the store and the movd. So even with XMM regs, this was sub-optimal. But clearly two different parts of GCC emitted those two pshufd instructions; one even printed the shuffle constant in hex while the other used decimal! I assume one is swapping and the other is just trying to get vec[1], the high element of the qword.
slower than no flags at all
The default is -O0, consistent-debugging mode that spills all variables to memory after every C statement, so it's pretty horrible and creates big store-forwarding latency bottlenecks. (Somewhat like if every variable were volatile.) But that's successful store forwarding, not stalls, so "only" ~5 cycles, which is still much worse than 0 for registers. (A few modern microarchitectures, including Zen 2, have some special cases that are lower latency.) The extra store and load instructions that have to go through the pipeline don't help either.
It's generally not interesting to benchmark -O0. -O1 or -Og should be your go-to baseline for the compiler to do the basic amount of optimization a normal person would expect, without anything fancy, but also without intentionally gimping the asm by skipping register allocation.
Semi-related: optimizing bubble sort for size instead of speed can involve a memory-destination rotate (creating store-forwarding stalls for back-to-back swaps), or a memory-destination xchg (implicit lock prefix -> very slow). See this Code Golf answer.
QUESTION
Efficient summation in Python
Asked 2022-Jan-16 at 12:49
I am trying to efficiently compute a summation of a summation in Python:
WolframAlpha is able to compute it to a high n value: sum of sum.
I have two approaches: a for-loop method and an np.sum method. I thought the np.sum approach would be faster. However, they are the same until a large n, after which np.sum has overflow errors and gives the wrong result.
I am trying to find the fastest way to compute this sum.
import numpy as np
import time

def summation(start,end,func):
    sum=0
    for i in range(start,end+1):
        sum+=func(i)
    return sum

def x(y):
    return y

def x2(y):
    return y**2

def mysum(y):
    return x2(y)*summation(0, y, x)

n=100

# method #1
start=time.time()
summation(0,n,mysum)
print('Slow method:',time.time()-start)

# method #2
start=time.time()
w=np.arange(0,n+1)
(w**2*np.cumsum(w)).sum()
print('Fast method:',time.time()-start)
ANSWER
Answered 2022-Jan-16 at 12:49
(fastest methods, 3 and 4, are at the end)
In the fast NumPy method you need to specify dtype=object (written np.object in older NumPy versions, since deprecated) so that NumPy does not convert the Python int to its own fixed-width dtypes (np.int64 or others). It will now give you correct results (checked up to N=100000).
# method #2
start=time.time()
w=np.arange(0, n+1, dtype=object)
result2 = (w**2*np.cumsum(w)).sum()
print('Fast method:', time.time()-start)
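To see the overflow concretely, here is a short sketch (n chosen large enough that the true result exceeds 64 bits; the closed form of the inner sum, x(x+1)/2, is used only to compute the exact reference value):

```python
import numpy as np

n = 100000
# exact value via the inner sum's closed form, using Python's big ints
exact = sum(x * x * (x * (x + 1) // 2) for x in range(n + 1))

w64 = np.arange(0, n + 1, dtype=np.int64)
wrapped = int((w64**2 * np.cumsum(w64)).sum())   # silently wraps modulo 2**64

wobj = np.arange(0, n + 1, dtype=object)
big = int((wobj**2 * np.cumsum(wobj)).sum())     # Python ints, arbitrary precision

assert big == exact        # object dtype stays exact
assert wrapped != exact    # the int64 result has overflowed
```

NumPy integer arrays wrap on overflow without raising, which is why the wrong answer appears only past a certain n and without any error message.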
Your fast solution is significantly faster than the slow one. Yes, for large N, but already at N=100 it is about 8 times faster:
start=time.time()
for i in range(100):
    result1 = summation(0, n, mysum)
print('Slow method:', time.time()-start)

# method #2
start=time.time()
for i in range(100):
    w=np.arange(0, n+1, dtype=object)
    result2 = (w**2*np.cumsum(w)).sum()
print('Fast method:', time.time()-start)
Slow method: 0.06906533241271973
Fast method: 0.008007287979125977
EDIT: An even faster method (by KellyBundy, the Pumpkin) uses pure Python. It turns out NumPy has no advantage here, because it has no vectorized code for the object dtype.
# method #3
import itertools
start=time.time()
for i in range(100):
    result3 = sum(x*x * ysum for x, ysum in enumerate(itertools.accumulate(range(n+1))))
print('Faster, pure python:', (time.time()-start))
Faster, pure python: 0.0009944438934326172
EDIT2: Forss noticed that the fast NumPy method can be optimized by using x*x instead of x**2. For N > 200 it is faster than the pure Python method; for N < 200 it is slower (the exact boundary may depend on the machine; on mine it was 200, so it's best to check it yourself):
# method #4
start=time.time()
for i in range(100):
    w = np.arange(0, n+1, dtype=object)
    result2 = (w*w*np.cumsum(w)).sum()
print('Fast method x*x:', time.time()-start)
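For completeness, the nested sum also has an O(1) closed form. This is not part of the original answer; it follows from the inner sum being x(x+1)/2 together with Faulhaber's formulas for sums of cubes and fourth powers, so treat it as a sketch to verify yourself:

```python
def nested_sum_closed_form(n):
    # sum_{x=0}^{n} x^2 * sum_{y=0}^{x} y  =  sum_{x=0}^{n} x^3 (x+1) / 2
    # = (S4 + S3) / 2, where S3 = (n(n+1)/2)^2  (sum of cubes)
    # and S4 = n(n+1)(2n+1)(3n^2 + 3n - 1)/30   (sum of fourth powers)
    s3 = (n * (n + 1) // 2) ** 2
    s4 = n * (n + 1) * (2 * n + 1) * (3 * n * n + 3 * n - 1) // 30
    return (s4 + s3) // 2

# cross-check against the brute-force double loop for small n
assert all(nested_sum_closed_form(n) ==
           sum(x * x * sum(range(x + 1)) for x in range(n + 1))
           for n in range(50))
```

With Python's arbitrary-precision integers this is exact for any n, and constant-time, so for large n it beats all four benchmarked methods.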
Community Discussions contain sources that include Stack Exchange Network