Gossip by dongjinyong | Java | Version: Current | License: No License
QUESTION
Upgraded Cassandra 3.11 to 4.0, failed with "node with address ... already exists"
Asked 2022-Mar-07 at 00:15
We are trying to upgrade Apache Cassandra 3.11.12 to 4.0.2. This is the first node we are upgrading in this cluster (a seed node). We drained the node and stopped the service before replacing the version.
system log:
INFO [RMI TCP Connection(16)-IP] 2022-03-03 15:50:18,811 StorageService.java:1568 - DRAINED
....
....
INFO [main] 2022-03-03 15:58:02,970 QueryProcessor.java:167 - Preloaded 0 prepared statements
INFO [main] 2022-03-03 15:58:02,970 StorageService.java:735 - Cassandra version: 4.0.2
INFO [main] 2022-03-03 15:58:02,971 StorageService.java:736 - CQL version: 3.4.5
INFO [main] 2022-03-03 15:58:02,971 StorageService.java:737 - Native protocol supported versions: 3/v3, 4/v4, 5/v5, 6/v6-beta (default: 5/v5)
...
...
WARN [main] 2022-03-03 15:58:03,328 SystemKeyspace.java:1130 - No host ID found, created d78ab047-f1f9-4a07-8118-2fa83f4571ef (Note: This should happen exactly once per node).
....
...
ERROR [main] 2022-03-03 15:58:04,543 CassandraDaemon.java:911 - Exception encountered during startup
java.lang.RuntimeException: A node with address /HOST_IP:7001 already exists, cancelling join. Use cassandra.replace_address if you want to replace this node.
at org.apache.cassandra.service.StorageService.checkForEndpointCollision(StorageService.java:660)
at org.apache.cassandra.service.StorageService.prepareToJoin(StorageService.java:935)
at org.apache.cassandra.service.StorageService.initServer(StorageService.java:785)
at org.apache.cassandra.service.StorageService.initServer(StorageService.java:730)
at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:420)
at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:765)
at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:889)
INFO [StorageServiceShutdownHook] 2022-03-03 15:58:04,558 HintsService.java:222 - Paused hints dispatch
WARN [StorageServiceShutdownHook] 2022-03-03 15:58:04,561 Gossiper.java:2032 - No local state, state is in silent shutdown, or node hasn't joined, not announcing shutdown
INFO [StorageServiceShutdownHook] 2022-03-03 15:58:04,561 MessagingService.java:441 - Waiting for messaging service to quiesce
...
..
INFO [StorageServiceShutdownHook] 2022-03-03 15:58:06,956 HintsService.java:222 - Paused hints dispatch
Did we need to delete (rm -rf) the system* data directories after draining the node, before starting the new Cassandra version? (We didn't do that.) How can we solve this problem?
ANSWER
Answered 2022-Mar-07 at 00:15
During startup, Cassandra tries to retrieve the host ID by querying the local system table with:
SELECT host_id FROM system.local WHERE key = 'local'
But if the system.local table is empty, or the SSTables are missing from the system/local-*/ data subdirectories, Cassandra assumes that it is a brand new node and assigns a new host ID. However, in your case Cassandra realises that another node with the same IP address is already part of the cluster when it gossips with the other nodes.
You need to figure out why Cassandra can't access the local system.local table. If someone deleted system/local-*/ from the data directory, then you won't be able to start the node again. If this was the case, you'll need to start from scratch, which involves wiping the contents of:
data/
commitlog/
saved_caches/
You will then need to replace the node "with itself" using the replace_address method. Cheers!
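For completeness, here is a rough sketch of what that "replace with itself" step could look like on a typical package install (the paths, service name and HOST_IP placeholder are assumptions; adjust them to your environment, and see the Cassandra docs for the replace_address flag named in the error above):
sudo systemctl stop cassandra
sudo rm -rf /var/lib/cassandra/data/* /var/lib/cassandra/commitlog/* /var/lib/cassandra/saved_caches/*
# Point the node at its own (now "dead") address so it re-bootstraps in place.
echo 'JVM_OPTS="$JVM_OPTS -Dcassandra.replace_address=HOST_IP"' | sudo tee -a /etc/cassandra/cassandra-env.sh
sudo systemctl start cassandra
# Once nodetool status shows the node as UN again, remove the added line so the
# flag is not applied on subsequent restarts.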
QUESTION
Mongo vs cassandra: single point of failure
Asked 2022-Mar-02 at 08:10
In the Cassandra vs Mongo debate, it is said that since Mongo has a master-slave architecture it has a single point of failure (the master): when the master fails, the slave nodes take time to elect a new master, leaving a window for downtime.
With Cassandra we don't have this problem because all nodes are equal. But Cassandra too has a system wherein nodes use a gossip protocol to keep themselves updated. In a gossip protocol a minimum number of nodes are needed to take part. Suppose one of the participating nodes goes down; then a new node needs to replace it. But it would take time to spawn a replacement node, and this is a situation similar to a master failure in Mongo.
So what's the difference between the two as far as single points of failure are concerned?
ANSWER
Answered 2022-Mar-02 at 07:11
Your assumptions about Cassandra are not correct, so allow me to explain.
Gossip does not require multiple nodes for it to work. It is possible to have a single-node cluster and gossip will still work so this statement is incorrect:
In gossip protocol a minimum number of nodes are needed to take part.
As a best practice, we recommend 3 replicas in each data centre (a replication factor of 3), so you need a minimum of 3 nodes in each data centre. With a replication factor of 3, your application can survive a node outage at consistency levels of ONE, LOCAL_ONE or the recommended LOCAL_QUORUM, so these statements are incorrect too:
Suppose if one of the participating node goes down, then a new node needs to replace it. But it would take time to spawn a new replacement node, and this is a situation similar to master failure in mongo.
The only ways to introduce single points-of-failure to your Cassandra cluster are:
As a side note, a friendly warning that other users may vote to close your question because comparisons are usually frowned upon since the answers are often based on opinions. Cheers!
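To make the replication/consistency point concrete, here is a minimal sketch using the DataStax Python driver (the driver, contact points and keyspace name are assumptions, not part of the question); with a replication factor of 3, a LOCAL_QUORUM read or write only needs 2 of the 3 replicas, so it keeps working while one node is down:
# Sketch only; assumes `pip install cassandra-driver` and a reachable 3-node data centre.
from cassandra import ConsistencyLevel
from cassandra.cluster import Cluster, ExecutionProfile, EXEC_PROFILE_DEFAULT

profile = ExecutionProfile(consistency_level=ConsistencyLevel.LOCAL_QUORUM)
cluster = Cluster(["10.0.0.1", "10.0.0.2", "10.0.0.3"],  # hypothetical contact points
                  execution_profiles={EXEC_PROFILE_DEFAULT: profile})
session = cluster.connect()

# RF=3 keyspace: LOCAL_QUORUM needs any 2 replicas, so a single node outage is tolerated.
session.execute("""
    CREATE KEYSPACE IF NOT EXISTS demo
    WITH replication = {'class': 'NetworkTopologyStrategy', 'datacenter1': 3}
""")
print(session.execute("SELECT cluster_name FROM system.local").one().cluster_name)
cluster.shutdown()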
QUESTION
What does solana-test-validator do on the background?
Asked 2022-Feb-27 at 13:18
When you run solana-test-validator, it begins a new process with the following output:
Ledger location: test-ledger
Log: test-ledger/validator.log
Identity: 4876NsAf6yH8c7uPXybETZPit142i2QhR7tfSoTPYjHf
Genesis Hash: CrqeHuGVmgHL54Sri7dEm2aCRLFopJrTHoQBYe6ciF7N
Version: 1.8.17
Shred Version: 28931
Gossip Address: 127.0.0.1:1024
TPU Address: 127.0.0.1:1027
JSON RPC URL: http://127.0.0.1:8899
⠄ 01:44:22 | Processed Slot: 48335 | Confirmed Slot: 48335 | Finalized Slot: 483
I understand:
I've also read Solana cluster, validator, slot, epochs docs. It says
Slot: The period of time for which each leader ingests transactions and produces a block.
Could someone explain what happens when we run solana-test-validator? In particular, does it produce blocks/ledger entries? From what it continuously displays
⠄ 01:44:22 | Processed Slot: 48335 | Confirmed Slot: 48335 | Finalized Slot: 483
it seems to be producing new blocks? If so, why do we need those ledger entries? After all, nothing happens locally on my cluster (no transactions, no SOL transfers, ...).
ANSWER
Answered 2022-Feb-27 at 13:18
To your first question the answer is yes.
To your second question: the test validator is a ledger node and, just like devnet/testnet/mainnet-beta, it keeps a temporal record (a block) as you progress through time, whether anything was done or not.
Edit: When you start and run solana-test-validator for the first time, it will create a default ledger called test-ledger in the directory you started it from.
If you start the test validator again in the same location, it will open the existing ledger. Over time the ledger may become quite large.
If you want to start with a clean ledger, you can either run:
rm -rf test-ledger
or start the validator with:
solana-test-validator --reset
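If you want to see that block production for yourself, one way (a hedged sketch using the JSON RPC URL printed by the validator above) is to call the getSlot RPC method twice a few seconds apart and watch the result grow:
curl -s http://127.0.0.1:8899 -X POST -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","id":1,"method":"getSlot"}'
# A steadily increasing "result" means new slots (and ledger entries) are being
# produced even though no transactions are being submitted.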
QUESTION
How to use asyncio and aioredis lock inside celery tasks?
Asked 2022-Feb-10 at 15:40
So, how do I run async tasks properly inside Celery to achieve the goal?
What is the RuntimeError: await wasn't used with future (below), and how can I fix it?
I have already tried:
1. asgiref's async_to_sync (from asgiref, https://pypi.org/project/asgiref/).
This option makes it possible to run asyncio coroutines, but the retry functionality doesn't work.
2. celery-pool-asyncio (https://pypi.org/project/celery-pool-asyncio/)
Same problem as with asgiref (it makes it possible to run asyncio coroutines, but retries don't work).
3. Writing my own async-to-sync decorator
I tried to create my own decorator like async_to_sync that runs coroutines thread-safely (asyncio.run_coroutine_threadsafe), but I got the behaviour described above.
I have also tried asyncio.run() or asyncio.get_event_loop().run_until_complete() (and self.retry(...)) inside the celery task. This works well: tasks run and retries work, but the coroutines execute incorrectly - inside an async function I cannot use aioredis.
Implementation notes:
celery -A celery_test.celery_app worker -l info -n worker1 -P gevent --concurrency=10 --without-gossip --without-mingle
transport = f"redis://localhost/9"
celery_app = Celery("worker", broker=transport, backend=transport,
include=['tasks'])
celery_app.conf.broker_transport_options = {
'visibility_timeout': 60 * 60 * 24,
'fanout_prefix': True,
'fanout_patterns': True
}
@contextmanager
def temp_asyncio_loop():
# asyncio.get_event_loop() automatically creates event loop only for main thread
try:
prev_loop = asyncio.get_event_loop()
except RuntimeError:
prev_loop = None
loop = asyncio.new_event_loop()
asyncio.set_event_loop(loop)
try:
yield loop
finally:
loop.stop()
loop.close()
del loop
asyncio.set_event_loop(prev_loop)
def with_temp_asyncio_loop(f):
@functools.wraps(f)
def wrapper(*args, **kwargs):
with temp_asyncio_loop() as t_loop:
return f(*args, loop=t_loop, **kwargs)
return wrapper
def await_(coro):
return asyncio.get_event_loop().run_until_complete(coro)
@celery_app.task(bind=True, max_retries=30, default_retry_delay=0)
@with_temp_asyncio_loop
def debug(self, **kwargs):
try:
await_(debug_async())
except Exception as exc:
self.retry(exc=exc)
async def debug_async():
async with RedisLock(f'redis_lock_{datetime.now()}'):
pass
class RedisLockException(Exception):
pass
class RedisLock(AsyncContextManager):
"""
Redis Lock class
:param lock_id: string (unique key)
:param value: dummy value
:param expire: int (time in seconds that key will storing)
:param expire_on_delete: int (time in seconds, set pause before deleting)
Usage:
try:
with RedisLock('123_lock', 5 * 60):
# do something
except RedisLockException:
"""
def __init__(self, lock_id: str, value='1', expire: int = 4, expire_on_delete: int = None):
self.lock_id = lock_id
self.expire = expire
self.value = value
self.expire_on_delete = expire_on_delete
async def acquire_lock(self):
return await redis.setnx(self.lock_id, self.value)
async def release_lock(self):
if self.expire_on_delete is None:
return await redis.delete(self.lock_id)
else:
await redis.expire(self.lock_id, self.expire_on_delete)
async def __aenter__(self, *args, **kwargs):
if not await self.acquire_lock():
raise RedisLockException({
'redis_lock': 'The process: {} still run, try again later'.format(await redis.get(self.lock_id))
})
await redis.expire(self.lock_id, self.expire)
async def __aexit__(self, exc_type, exc_value, traceback):
await self.release_lock()
On my Windows machine, await redis.setnx(...) blocks the Celery worker: it stops producing logs and Ctrl+C doesn't work.
Inside the Docker container, I receive an error. Here is part of the traceback:
Traceback (most recent call last):
File "/usr/local/lib/python3.9/site-packages/aioredis/connection.py", line 854, in read_response
response = await self._parser.read_response()
File "/usr/local/lib/python3.9/site-packages/aioredis/connection.py", line 366, in read_response
raise ConnectionError(SERVER_CLOSED_CONNECTION_ERROR)
aioredis.exceptions.ConnectionError: Connection closed by server.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.9/site-packages/celery/app/trace.py", line 451, in trace_task
R = retval = fun(*args, **kwargs)
File "/usr/local/lib/python3.9/site-packages/celery/app/trace.py", line 734, in __protected_call__
return self.run(*args, **kwargs)
File "/usr/local/lib/python3.9/site-packages/celery/app/autoretry.py", line 54, in run
ret = task.retry(exc=exc, **retry_kwargs)
File "/usr/local/lib/python3.9/site-packages/celery/app/task.py", line 717, in retry
raise_with_context(exc)
File "/usr/local/lib/python3.9/site-packages/celery/app/autoretry.py", line 34, in run
return task._orig_run(*args, **kwargs)
File "/app/celery_tasks/tasks.py", line 69, in wrapper
return f(*args, **kwargs) # <--- inside with_temp_asyncio_loop from utils
...
File "/usr/local/lib/python3.9/contextlib.py", line 575, in enter_async_context
result = await _cm_type.__aenter__(cm)
File "/app/db/redis.py", line 50, in __aenter__
if not await self.acquire_lock():
File "/app/db/redis.py", line 41, in acquire_lock
return await redis.setnx(self.lock_id, self.value)
File "/usr/local/lib/python3.9/site-packages/aioredis/client.py", line 1064, in execute_command
return await self.parse_response(conn, command_name, **options)
File "/usr/local/lib/python3.9/site-packages/aioredis/client.py", line 1080, in parse_response
response = await connection.read_response()
File "/usr/local/lib/python3.9/site-packages/aioredis/connection.py", line 859, in read_response
await self.disconnect()
File "/usr/local/lib/python3.9/site-packages/aioredis/connection.py", line 762, in disconnect
await self._writer.wait_closed()
File "/usr/local/lib/python3.9/asyncio/streams.py", line 359, in wait_closed
await self._protocol._get_close_waiter(self)
RuntimeError: await wasn't used with future
celery==5.2.1
aioredis==2.0.0
ANSWER
Answered 2022-Feb-04 at 07:59
Maybe this helps: https://github.com/aio-libs/aioredis-py/issues/1273
The main point is:
replace all the calls to get_event_loop with get_running_loop, which would remove that Runtime exception when a future is attached to a different loop.
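In practical terms, the safest pattern is to make sure every aioredis object is created on the loop that will actually run it. A minimal sketch of that idea, adapted from the question's code (the redis= parameter on RedisLock is hypothetical; the original class reads a module-level client):
# Sketch only; assumes aioredis==2.0.0 as pinned above and a local Redis.
import aioredis

async def debug_async():
    redis = aioredis.from_url("redis://localhost/9")  # one client per (temporary) event loop
    try:
        async with RedisLock(f"redis_lock_{datetime.now()}", redis=redis):
            pass
    finally:
        await redis.close()  # close before the temporary loop is stopped and closed
Because the client (and every future it creates) now belongs to the loop that run_until_complete is driving, nothing ends up awaited on a different loop.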
QUESTION
How to make a redis cluster in k8s environment using nodeport service type?
Asked 2022-Jan-26 at 07:00
I have tried to make a Redis cluster in a k8s environment using the "NodePort" type of service. More specifically, I want to compose a Redis cluster across two different k8s clusters.
When I used LoadBalancer (external IP) as the service type, the cluster was created successfully. The problem is NodePort.
After I run redis-cli --cluster create, it gets stuck on "Waiting for the cluster to join".
Below are the logs of the cluster create command. I deployed 4 leader pods and 4 slave pods, each with an individual NodePort service.
root@redis-leader00-5fc546c4bd-28x8w:/data# redis-cli -a mypassword --cluster create --cluster-replicas 1 \
> 192.168.9.194:30030 192.168.9.199:30031 192.168.9.194:30032 192.168.9.199:30033 \
> 192.168.9.199:30030 192.168.9.194:30031 192.168.9.199:30032 192.168.9.194:30033
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
>>> Performing hash slots allocation on 8 nodes...
Master[0] -> Slots 0 - 4095
Master[1] -> Slots 4096 - 8191
Master[2] -> Slots 8192 - 12287
Master[3] -> Slots 12288 - 16383
Adding replica 192.168.9.199:30030 to 192.168.9.194:30030
Adding replica 192.168.9.194:30033 to 192.168.9.199:30031
Adding replica 192.168.9.199:30032 to 192.168.9.194:30032
Adding replica 192.168.9.194:30031 to 192.168.9.199:30033
M: 94bf3c6760e6b3a91c408eda497822b4961e8d82 192.168.9.194:30030
slots:[0-4095] (4096 slots) master
M: 31f4a9604b15109316f91956aa4a32b0c6952a4d 192.168.9.199:30031
slots:[4096-8191] (4096 slots) master
M: 0738d1e1a677352fc3b0b3600a67d837b795fa8a 192.168.9.194:30032
slots:[8192-12287] (4096 slots) master
M: 7dd7edbfab6952273460778d1f140b0716118042 192.168.9.199:30033
slots:[12288-16383] (4096 slots) master
S: 17e044681319d0a05bd92deeb4ead31c0cd468e2 192.168.9.199:30030
replicates 94bf3c6760e6b3a91c408eda497822b4961e8d82
S: 9c9e47ec566ac781e8e3dcb51398a27d1da71004 192.168.9.194:30031
replicates 7dd7edbfab6952273460778d1f140b0716118042
S: b8f7028b56f96565a91fdb442c94fbedcee088c2 192.168.9.199:30032
replicates 0738d1e1a677352fc3b0b3600a67d837b795fa8a
S: e4c9ffdf67e2b2ef9f840618110738358bde52d5 192.168.9.194:30033
replicates 31f4a9604b15109316f91956aa4a32b0c6952a4d
Can I set the above configuration? (type 'yes' to accept): yes
>>> Nodes configuration updated
>>> Assign a different config epoch to each node
>>> Sending CLUSTER MEET messages to join the cluster
Waiting for the cluster to join
...........................................
The weird point is that the other Redis containers received the signal, but there is no progress. Below are the logs of the other Redis containers.
.... // Some logs for initializing redis
1:M 20 Jan 2022 06:09:12.055 * Ready to accept connections
1:M 20 Jan 2022 06:13:41.263 # configEpoch set to 5 via CLUSTER SET-CONFIG-EPOCH
I thought the communication was successful but the gossip port didn't work properly, so I modified redis.conf and set cluster-announce-bus-port, but that didn't work either.
How can I compose a Redis cluster using the NodePort type of service?
Please refer to one of the .yaml files:
apiVersion: v1
kind: ConfigMap
metadata:
name: redis-cluster-leader00
namespace: redis
labels:
app: redis-cluster
leader: "00"
data:
fix-ip.sh: |
#!/bin/sh
CLUSTER_CONFIG="/data/nodes.conf"
if [ -f ${CLUSTER_CONFIG} ]; then
if [ -z "${HOST_IP}" ]; then
echo "Unable to determine Pod IP address!"
exit 1
fi
echo "Updating my IP to ${HOST_IP} in ${CLUSTER_CONFIG}"
sed -i.bak -e "/myself/ s/[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}/${HOST_IP}/" ${CLUSTER_CONFIG}
fi
exec "$@"
redis.conf: |+
bind 0.0.0.0
cluster-enabled yes
cluster-require-full-coverage no
cluster-node-timeout 15000
cluster-config-file /data/nodes.conf
cluster-migration-barrier 1
appendonly no
save ""
protected-mode no
requirepass "mypassword"
masterauth "mypassword"
cluster-announce-ip 192.168.9.194
cluster-announce-port 30030
cluster-announce-bus-port 31030
---
apiVersion: v1
kind: Service
metadata:
name: redis-leader00
namespace: redis
labels:
app: redis
role: leader
tier: backend
leader: "00"
spec:
ports:
- port: 6379
targetPort: 6379
nodePort: 30030
name: client
- port: 16379
targetPort: 16379
nodePort: 31030
name: gossip
selector:
app: redis
role: leader
tier: backend
leader: "00"
type: NodePort
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: redis-leader00
namespace: redis
labels:
app: redis
role: leader
tier: backend
leader: "00"
spec:
replicas: 1
selector:
matchLabels:
app: redis
leader: "00"
template:
metadata:
labels:
app: redis
role: leader
tier: backend
leader: "00"
spec:
containers:
- name: leader
image: docker.io/redis:6.0.5
resources:
requests:
cpu: 100m
memory: 100Mi
ports:
- containerPort: 6379
name: client
- containerPort: 16379
name: gossip
volumeMounts:
- name: conf
mountPath: /conf
readOnly: false
args: ["--requirepass", "mypassword"]
command: ["/conf/fix-ip.sh", "redis-server", "/conf/redis.conf"]
env:
- name: HOST_IP
valueFrom:
fieldRef:
fieldPath: status.hostIP
volumes:
- name: conf
configMap:
name: redis-cluster-leader00
defaultMode: 0755
Also see the nodes.conf file in the container after I ran redis-cli --cluster create:
root@redis-leader01-87ccb466-bsnq4:/data# cat nodes.conf
31f4a9604b15109316f91956aa4a32b0c6952a4d 192.168.9.199:30031@31031 myself,master - 0 0 2 connected 4096-8191
vars currentEpoch 2 lastVoteEpoch 0
ANSWER
Answered 2022-Jan-20 at 07:00
I am not sure of the actual process you are following to create the Redis cluster; however, I would suggest checking out the Helm chart to deploy the Redis cluster on K8s.
Using the Helm chart it's easy to manage and deploy the Redis cluster on K8s:
https://github.com/bitnami/charts/tree/master/bitnami/redis
To deploy the chart you just have to run:
helm install my-release bitnami/redis
On the NodePort side, once the Helm chart is deployed you can edit the service type, or you can update the Helm chart values first and then apply those changes to K8s.
This will create a NodePort on K8s for the Redis service.
You can find the service templates here: https://github.com/bitnami/charts/tree/master/bitnami/redis/templates
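If you go the chart route and still need NodePort, one hedged option after installation is to patch the generated service (the service name and namespace below are assumptions; they depend on your release name and chart version):
kubectl -n redis patch svc my-release-redis-master --type merge -p '{"spec": {"type": "NodePort"}}'
kubectl -n redis get svc my-release-redis-master   # note the nodePort that was assigned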
QUESTION
Correct EventStore config for a 3 node cluster?
Asked 2022-Jan-19 at 21:04
So I had EventStore 5.0.7 installed as a 3-node cluster, working just fine.
I tried to upgrade to EventStore 21.10.1. The config for EventStore has changed substantially with the move from 5.x to 20.x and 21.x, and despite multiple readings of all kinds of documentation, I'm still doing something wrong.
What we see is 6 nodes appearing - each server twice - the gossip failing, and nothing working, i.e., we cannot insert events.
What am I doing wrong?
EventStore 5.0.7
EventStore 21.10.1
Config for EventStore 21.10.1
---
# Paths
Db: /var/lib/eventstore
Index: /var/lib/eventstore/index
Log: /var/log/eventstore
# Run in insecure mode
Insecure: true
DisableInternalTcpTls: true
DisableExternalTcpTls: true
# Network configuration
IntIp: 172.31.47.243
ExtIp: 0.0.0.0
HttpPort: 2113
IntTcpPort: 1112
ExtTcpPort: 1113
EnableExternalTcp: true
EnableAtomPubOverHTTP: false
# Projections configuration
RunProjections: System
ClusterSize: 3
LogLevel: Verbose
LogHttpRequests: true
LogFailedAuthenticationAttempts: true
LogConfig: /etc/eventstore/logconfig.json
HttpPortAdvertiseAs: 2114
ExtHostAdvertiseAs: 54.209.234.141
IntTcpHeartbeatTimeout: 2000
ExtTcpHeartbeatTimeout: 2000
IntTcpHeartbeatInterval: 5000
ExtTcpHeartbeatInterval: 5000
GossipTimeoutMs: 5000
GossipIntervalMs: 2000
StatsPeriodSec: 900
DiscoverViaDns: false
GossipSeed: 172.31.45.192:2113,172.31.41.141:2113
Config for EventStore 21.10.1 (as seen at startup)
MODIFIED OPTIONS:
STATS PERIOD SEC: 900 (Yaml)
LOG HTTP REQUESTS: true (Yaml)
LOG FAILED AUTHENTICATION ATTEMPTS: true (Yaml)
INSECURE: true (Yaml)
LOG: /var/log/eventstore (Yaml)
LOG CONFIG: /etc/eventstore/logconfig.json (Yaml)
LOG LEVEL: Verbose (Yaml)
CLUSTER SIZE: 3 (Yaml)
DISCOVER VIA DNS: false (Yaml)
GOSSIP SEED: 172.31.46.96:2113,172.31.40.110:2113 (Yaml)
GOSSIP INTERVAL MS: 2000 (Yaml)
GOSSIP TIMEOUT MS: 5000 (Yaml)
DB: /var/lib/eventstore (Yaml)
INDEX: /var/lib/eventstore/index (Yaml)
INT IP: 172.31.35.133 (Yaml)
EXT IP: 0.0.0.0 (Yaml)
HTTP PORT: 2113 (Yaml)
ENABLE EXTERNAL TCP: true (Yaml)
INT TCP PORT: 1112 (Yaml)
EXT TCP PORT: 1113 (Yaml)
EXT HOST ADVERTISE AS: 3.82.200.231 (Yaml)
HTTP PORT ADVERTISE AS: 2114 (Yaml)
INT TCP HEARTBEAT TIMEOUT: 2000 (Yaml)
EXT TCP HEARTBEAT TIMEOUT: 2000 (Yaml)
INT TCP HEARTBEAT INTERVAL: 5000 (Yaml)
EXT TCP HEARTBEAT INTERVAL: 5000 (Yaml)
DISABLE INTERNAL TCP TLS: true (Yaml)
DISABLE EXTERNAL TCP TLS: true (Yaml)
ENABLE ATOM PUB OVER HTTP: false (Yaml)
RUN PROJECTIONS: System (Yaml)
DEFAULT OPTIONS:
HELP: False (<DEFAULT>)
VERSION: False (<DEFAULT>)
CONFIG: /etc/eventstore/eventstore.conf (<DEFAULT>)
WHAT IF: False (<DEFAULT>)
START STANDARD PROJECTIONS: False (<DEFAULT>)
DISABLE HTTP CACHING: False (<DEFAULT>)
WORKER THREADS: 0 (<DEFAULT>)
ENABLE HISTOGRAMS: False (<DEFAULT>)
SKIP INDEX SCAN ON READS: False (<DEFAULT>)
MAX APPEND SIZE: 1048576 (<DEFAULT>)
LOG CONSOLE FORMAT: Plain (<DEFAULT>)
LOG FILE SIZE: 1073741824 (<DEFAULT>)
LOG FILE INTERVAL: Day (<DEFAULT>)
LOG FILE RETENTION COUNT: 31 (<DEFAULT>)
DISABLE LOG FILE: False (<DEFAULT>)
AUTHORIZATION TYPE: internal (<DEFAULT>)
AUTHORIZATION CONFIG: <empty> (<DEFAULT>)
AUTHENTICATION TYPE: internal (<DEFAULT>)
AUTHENTICATION CONFIG: <empty> (<DEFAULT>)
DISABLE FIRST LEVEL HTTP AUTHORIZATION: False (<DEFAULT>)
TRUSTED ROOT CERTIFICATES PATH: <empty> (<DEFAULT>)
CERTIFICATE RESERVED NODE COMMON NAME: eventstoredb-node (<DEFAULT>)
CERTIFICATE FILE: <empty> (<DEFAULT>)
CERTIFICATE PRIVATE KEY FILE: <empty> (<DEFAULT>)
CERTIFICATE PASSWORD: <empty> (<DEFAULT>)
CERTIFICATE STORE LOCATION: <empty> (<DEFAULT>)
CERTIFICATE STORE NAME: <empty> (<DEFAULT>)
CERTIFICATE SUBJECT NAME: <empty> (<DEFAULT>)
CERTIFICATE THUMBPRINT: <empty> (<DEFAULT>)
STREAM INFO CACHE CAPACITY: 0 (<DEFAULT>)
NODE PRIORITY: 0 (<DEFAULT>)
COMMIT COUNT: -1 (<DEFAULT>)
PREPARE COUNT: -1 (<DEFAULT>)
CLUSTER DNS: fake.dns (<DEFAULT>)
CLUSTER GOSSIP PORT: 2113 (<DEFAULT>)
GOSSIP ALLOWED DIFFERENCE MS: 60000 (<DEFAULT>)
READ ONLY REPLICA: False (<DEFAULT>)
UNSAFE ALLOW SURPLUS NODES: False (<DEFAULT>)
DEAD MEMBER REMOVAL PERIOD SEC: 1800 (<DEFAULT>)
LEADER ELECTION TIMEOUT MS: 1000 (<DEFAULT>)
QUORUM SIZE: 1 (<DEFAULT>)
PREPARE ACK COUNT: 1 (<DEFAULT>)
COMMIT ACK COUNT: 1 (<DEFAULT>)
MIN FLUSH DELAY MS: 2 (<DEFAULT>)
DISABLE SCAVENGE MERGING: False (<DEFAULT>)
SCAVENGE HISTORY MAX AGE: 30 (<DEFAULT>)
CACHED CHUNKS: -1 (<DEFAULT>)
CHUNKS CACHE SIZE: 536871424 (<DEFAULT>)
MAX MEM TABLE SIZE: 1000000 (<DEFAULT>)
HASH COLLISION READ LIMIT: 100 (<DEFAULT>)
MEM DB: False (<DEFAULT>)
USE INDEX BLOOM FILTERS: True (<DEFAULT>)
INDEX CACHE SIZE: 0 (<DEFAULT>)
SKIP DB VERIFY: False (<DEFAULT>)
WRITE THROUGH: False (<DEFAULT>)
UNBUFFERED: False (<DEFAULT>)
CHUNK INITIAL READER COUNT: 5 (<DEFAULT>)
PREPARE TIMEOUT MS: 2000 (<DEFAULT>)
COMMIT TIMEOUT MS: 2000 (<DEFAULT>)
WRITE TIMEOUT MS: 2000 (<DEFAULT>)
UNSAFE DISABLE FLUSH TO DISK: False (<DEFAULT>)
UNSAFE IGNORE HARD DELETE: False (<DEFAULT>)
SKIP INDEX VERIFY: False (<DEFAULT>)
INDEX CACHE DEPTH: 16 (<DEFAULT>)
OPTIMIZE INDEX MERGE: False (<DEFAULT>)
ALWAYS KEEP SCAVENGED: False (<DEFAULT>)
REDUCE FILE CACHE PRESSURE: False (<DEFAULT>)
INITIALIZATION THREADS: 1 (<DEFAULT>)
READER THREADS COUNT: 0 (<DEFAULT>)
MAX AUTO MERGE INDEX LEVEL: 2147483647 (<DEFAULT>)
WRITE STATS TO DB: False (<DEFAULT>)
MAX TRUNCATION: 268435456 (<DEFAULT>)
CHUNK SIZE: 268435456 (<DEFAULT>)
STATS STORAGE: File (<DEFAULT>)
DB LOG FORMAT: V2 (<DEFAULT>)
STREAM EXISTENCE FILTER SIZE: 256000000 (<DEFAULT>)
KEEP ALIVE INTERVAL: 10000 (<DEFAULT>)
KEEP ALIVE TIMEOUT: 10000 (<DEFAULT>)
INT HOST ADVERTISE AS: <empty> (<DEFAULT>)
ADVERTISE HOST TO CLIENT AS: <empty> (<DEFAULT>)
ADVERTISE HTTP PORT TO CLIENT AS: 0 (<DEFAULT>)
ADVERTISE TCP PORT TO CLIENT AS: 0 (<DEFAULT>)
EXT TCP PORT ADVERTISE AS: 0 (<DEFAULT>)
INT TCP PORT ADVERTISE AS: 0 (<DEFAULT>)
GOSSIP ON SINGLE NODE: <empty> (<DEFAULT>)
CONNECTION PENDING SEND BYTES THRESHOLD: 10485760 (<DEFAULT>)
CONNECTION QUEUE SIZE THRESHOLD: 50000 (<DEFAULT>)
DISABLE ADMIN UI: False (<DEFAULT>)
DISABLE STATS ON HTTP: False (<DEFAULT>)
DISABLE GOSSIP ON HTTP: False (<DEFAULT>)
ENABLE TRUSTED AUTH: False (<DEFAULT>)
PROJECTION THREADS: 3 (<DEFAULT>)
PROJECTIONS QUERY EXPIRY: 5 (<DEFAULT>)
FAULT OUT OF ORDER PROJECTIONS: False (<DEFAULT>)
PROJECTION COMPILATION TIMEOUT: 500 (<DEFAULT>)
PROJECTION EXECUTION TIMEOUT: 250 (<DEFAULT>)
Gossip for EventStore 21.10.1
{
"members": [
{
"instanceId": "ed2ee047-eb59-4b11-86fd-a5b366edd0ce",
"timeStamp": "2022-01-12T23:17:42.539034Z",
"state": "Unknown",
"isAlive": true,
"internalTcpIp": "172.31.46.231",
"internalTcpPort": 1112,
"internalSecureTcpPort": 0,
"externalTcpIp": "52.91.48.59",
"externalTcpPort": 1113,
"externalSecureTcpPort": 0,
"httpEndPointIp": "52.91.48.59",
"httpEndPointPort": 2114,
"lastCommitPosition": -1,
"writerCheckpoint": 0,
"chaserCheckpoint": 0,
"epochPosition": -1,
"epochNumber": -1,
"epochId": "00000000-0000-0000-0000-000000000000",
"nodePriority": 0,
"isReadOnlyReplica": false
},
{
"instanceId": "dfcc4139-2966-454c-8cee-71261cedafba",
"timeStamp": "2022-01-12T23:17:40.0803168Z",
"state": "Unknown",
"isAlive": false,
"internalTcpIp": "172.31.46.43",
"internalTcpPort": 1112,
"internalSecureTcpPort": 0,
"externalTcpIp": "44.201.237.180",
"externalTcpPort": 1113,
"externalSecureTcpPort": 0,
"httpEndPointIp": "44.201.237.180",
"httpEndPointPort": 2114,
"lastCommitPosition": -1,
"writerCheckpoint": 0,
"chaserCheckpoint": 0,
"epochPosition": -1,
"epochNumber": -1,
"epochId": "00000000-0000-0000-0000-000000000000",
"nodePriority": 0,
"isReadOnlyReplica": false
},
{
"instanceId": "2a47929c-afd6-496f-b87b-d85904eeed18",
"timeStamp": "2022-01-12T23:17:40.539795Z",
"state": "Unknown",
"isAlive": true,
"internalTcpIp": "172.31.38.246",
"internalTcpPort": 1112,
"internalSecureTcpPort": 0,
"externalTcpIp": "3.93.17.39",
"externalTcpPort": 1113,
"externalSecureTcpPort": 0,
"httpEndPointIp": "3.93.17.39",
"httpEndPointPort": 2114,
"lastCommitPosition": -1,
"writerCheckpoint": 0,
"chaserCheckpoint": 0,
"epochPosition": -1,
"epochNumber": -1,
"epochId": "00000000-0000-0000-0000-000000000000",
"nodePriority": 0,
"isReadOnlyReplica": false
},
{
"instanceId": "00000000-0000-0000-0000-000000000000",
"timeStamp": "2022-01-12T22:39:46.4047071Z",
"state": "Manager",
"isAlive": true,
"internalTcpIp": "172.31.46.43",
"internalTcpPort": 2113,
"internalSecureTcpPort": 0,
"externalTcpIp": "172.31.46.43",
"externalTcpPort": 2113,
"externalSecureTcpPort": 0,
"httpEndPointIp": "172.31.46.43",
"httpEndPointPort": 2113,
"lastCommitPosition": -1,
"writerCheckpoint": -1,
"chaserCheckpoint": -1,
"epochPosition": -1,
"epochNumber": -1,
"epochId": "00000000-0000-0000-0000-000000000000",
"nodePriority": 0,
"isReadOnlyReplica": false
},
{
"instanceId": "00000000-0000-0000-0000-000000000000",
"timeStamp": "2022-01-12T22:53:47.9621597Z",
"state": "Manager",
"isAlive": true,
"internalTcpIp": "172.31.46.231",
"internalTcpPort": 2113,
"internalSecureTcpPort": 0,
"externalTcpIp": "172.31.46.231",
"externalTcpPort": 2113,
"externalSecureTcpPort": 0,
"httpEndPointIp": "172.31.46.231",
"httpEndPointPort": 2113,
"lastCommitPosition": -1,
"writerCheckpoint": -1,
"chaserCheckpoint": -1,
"epochPosition": -1,
"epochNumber": -1,
"epochId": "00000000-0000-0000-0000-000000000000",
"nodePriority": 0,
"isReadOnlyReplica": false
},
{
"instanceId": "00000000-0000-0000-0000-000000000000",
"timeStamp": "2022-01-12T22:53:47.9621597Z",
"state": "Manager",
"isAlive": true,
"internalTcpIp": "172.31.38.246",
"internalTcpPort": 2113,
"internalSecureTcpPort": 0,
"externalTcpIp": "172.31.38.246",
"externalTcpPort": 2113,
"externalSecureTcpPort": 0,
"httpEndPointIp": "172.31.38.246",
"httpEndPointPort": 2113,
"lastCommitPosition": -1,
"writerCheckpoint": -1,
"chaserCheckpoint": -1,
"epochPosition": -1,
"epochNumber": -1,
"epochId": "00000000-0000-0000-0000-000000000000",
"nodePriority": 0,
"isReadOnlyReplica": false
}
],
"serverIp": "52.91.48.59",
"serverPort": 2114
}
ANSWER
Answered 2022-Jan-14 at 17:24This online tool : https://configurator.eventstore.com/ should help you setup the configuration correctly
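As a cross-check while using the configurator, the cluster-discovery part of each node's config reduces to something like the sketch below (the values are taken from the config already shown above for one node; the other two nodes mirror it with their own IntIp and the remaining two nodes as gossip seeds):
ClusterSize: 3
DiscoverViaDns: false
GossipSeed: 172.31.45.192:2113,172.31.41.141:2113
IntIp: 172.31.47.243
HttpPort: 2113
Insecure: true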
QUESTION
How can you tell if a solana node is synced?
Asked 2022-Jan-18 at 03:50
I'm running a Solana node using the solana-validator command (see the Solana docs).
And I'd like to know if my validator is ready to accept connections on the http/rpc/ws ports. What's the quickest way to check whether it's synced?
Currently, I'm using wscat to check whether I can connect to the websocket, but I am unable to. I'm not sure if that's because the node isn't set up right, or it's not synced, etc.
I know if I run solana gossip I should be able to see my IP in the list that populates... but is that the best way?
ANSWER
Answered 2022-Jan-04 at 18:54
Take a look at solana catchup, which does exactly what you're asking for: https://docs.solana.com/cli/usage#solana-catchup
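For example (a hedged sketch; exact flags can vary between Solana CLI releases), from the validator host you can run:
solana catchup --our-localhost
# Reports how many slots the local node is behind the cluster and prints a
# "has caught up" style message once the gap closes.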
QUESTION
RabbitMQ, Celery and Django - connection to broker lost. Trying to re-establish the connection
Asked 2021-Dec-23 at 15:56
Celery disconnects from RabbitMQ each time a task is passed to RabbitMQ; however, the task does eventually succeed.
My questions are:
Celery version: 5.1.2, RabbitMQ version: 3.9.0, Erlang version: 24.0.4
RabbitMQ error (sorry for the length of the log):
** Generic server <0.11908.0> terminating
** Last message in was {'$gen_cast',
{method,{'basic.ack',1,false},none,noflow}}
** When Server state == {ch,
{conf,running,rabbit_framing_amqp_0_9_1,1,
<0.11899.0>,<0.11906.0>,<0.11899.0>,
<<"someIPAddress:45610 -> someIPAddress:5672">>,
undefined,
{user,<<"someadmin">>,
[administrator],
[{rabbit_auth_backend_internal,none}]},
<<"backoffice">>,<<"celery">>,<0.11900.0>,
[{<<"consumer_cancel_notify">>,bool,true},
{<<"connection.blocked">>,bool,true},
{<<"authentication_failure_close">>,bool,true}],
none,0,134217728,1800000,#{},1000000000},
{lstate,<0.11907.0>,true},
none,2,
{1,
{[{pending_ack,1,<<"None4">>,1627738474140,
{resource,<<"backoffice">>,queue,<<"celery">>},
2097}],
[]}},
{state,#{},erlang},
#{<<"None4">> =>
{{amqqueue,
{resource,<<"backoffice">>,queue,<<"celery">>},
true,false,none,[],<0.471.0>,[],[],[],undefined,
undefined,[],[],live,0,[],<<"backoffice">>,
#{user => <<"someadmin">>},
rabbit_classic_queue,#{}},
{false,0,false,[]}}},
#{{resource,<<"backoffice">>,queue,<<"celery">>} =>
{1,{<<"None4">>,nil,nil}}},
{state,none,5000,undefined},
false,1,
{rabbit_confirms,undefined,#{}},
[],[],none,flow,[],
{rabbit_queue_type,
#{{resource,<<"backoffice">>,queue,<<"celery">>} =>
{ctx,rabbit_classic_queue,
{resource,<<"backoffice">>,queue,<<"celery">>},
{rabbit_classic_queue,<0.471.0>,
{resource,<<"backoffice">>,queue,<<"celery">>},
#{}}}},
#{<0.471.0> =>
{resource,<<"backoffice">>,queue,<<"celery">>}}},
#Ref<0.4203289403.2328100865.106387>,false}
** Reason for termination ==
** {function_clause,
[{rabbit_channel,'-notify_limiter/2-fun-0-',
[{pending_ack,1,<<"None4">>,1627738474140,
{resource,<<"backoffice">>,queue,<<"celery">>},
2097},
0],
[{file,"src/rabbit_channel.erl"},{line,2124}]},
{lists,foldl,3,[{file,"lists.erl"},{line,1267}]},
{rabbit_channel,notify_limiter,2,
[{file,"src/rabbit_channel.erl"},{line,2124}]},
{rabbit_channel,ack,2,[{file,"src/rabbit_channel.erl"},{line,2057}]},
{rabbit_channel,handle_method,3,
[{file,"src/rabbit_channel.erl"},{line,1343}]},
{rabbit_channel,handle_cast,2,
[{file,"src/rabbit_channel.erl"},{line,644}]},
{gen_server2,handle_msg,2,[{file,"src/gen_server2.erl"},{line,1067}]},
{proc_lib,wake_up,3,[{file,"proc_lib.erl"},{line,236}]}]}
crasher:
initial call: rabbit_channel:init/1
pid: <0.11908.0>
registered_name: []
exception exit: {function_clause,
[{rabbit_channel,'-notify_limiter/2-fun-0-',
[{pending_ack,1,<<"None4">>,1627738474140,
{resource,<<"backoffice">>,queue,
<<"celery">>},
2097},
0],
[{file,"src/rabbit_channel.erl"},{line,2124}]},
{lists,foldl,3,[{file,"lists.erl"},{line,1267}]},
{rabbit_channel,notify_limiter,2,
[{file,"src/rabbit_channel.erl"},{line,2124}]},
{rabbit_channel,ack,2,
[{file,"src/rabbit_channel.erl"},{line,2057}]},
{rabbit_channel,handle_method,3,
[{file,"src/rabbit_channel.erl"},{line,1343}]},
{rabbit_channel,handle_cast,2,
[{file,"src/rabbit_channel.erl"},{line,644}]},
{gen_server2,handle_msg,2,
[{file,"src/gen_server2.erl"},{line,1067}]},
{proc_lib,wake_up,3,
[{file,"proc_lib.erl"},{line,236}]}]}
in function gen_server2:terminate/3 (src/gen_server2.erl, line 1183)
ancestors: [<0.11905.0>,<0.11903.0>,<0.11898.0>,<0.11897.0>,<0.508.0>,
<0.507.0>,<0.506.0>,<0.504.0>,<0.503.0>,rabbit_sup,
<0.224.0>]
message_queue_len: 0
messages: []
links: [<0.11905.0>]
dictionary: [{channel_operation_timeout,15000},
{process_name,
{rabbit_channel,
{<<"someIPAddress:45610 -> someIPAddress:5672">>,
1}}},
{rand_seed,
{#{jump => #Fun<rand.3.92093067>,
max => 288230376151711743,
next => #Fun<rand.5.92093067>,type => exsplus},
[262257290895536220|242201045588130196]}},
{{xtype_to_module,direct},rabbit_exchange_type_direct},
{permission_cache_can_expire,false},
{msg_size_for_gc,115}]
trap_exit: true
status: running
heap_size: 28690
stack_size: 29
reductions: 67935
neighbours:
Error on AMQP connection <0.11899.0> (someIPAddress:45610 -> someIPAddress:5672, vhost: 'backoffice', user: 'someadmin', state: running), channel 1:
{function_clause,
[{rabbit_channel,'-notify_limiter/2-fun-0-',
[{pending_ack,1,<<"None4">>,1627738474140,
{resource,<<"backoffice">>,queue,<<"celery">>},
2097},
0],
[{file,"src/rabbit_channel.erl"},{line,2124}]},
{lists,foldl,3,[{file,"lists.erl"},{line,1267}]},
{rabbit_channel,notify_limiter,2,
[{file,"src/rabbit_channel.erl"},{line,2124}]},
{rabbit_channel,ack,2,[{file,"src/rabbit_channel.erl"},{line,2057}]},
{rabbit_channel,handle_method,3,
[{file,"src/rabbit_channel.erl"},{line,1343}]},
{rabbit_channel,handle_cast,2,
[{file,"src/rabbit_channel.erl"},{line,644}]},
{gen_server2,handle_msg,2,[{file,"src/gen_server2.erl"},{line,1067}]},
{proc_lib,wake_up,3,[{file,"proc_lib.erl"},{line,236}]}]}
supervisor: {<0.11905.0>,rabbit_channel_sup}
errorContext: child_terminated
reason: {function_clause,
[{rabbit_channel,'-notify_limiter/2-fun-0-',
[{pending_ack,1,<<"None4">>,1627738474140,
{resource,<<"backoffice">>,queue,<<"celery">>},
2097},
0],
[{file,"src/rabbit_channel.erl"},{line,2124}]},
{lists,foldl,3,[{file,"lists.erl"},{line,1267}]},
{rabbit_channel,notify_limiter,2,
[{file,"src/rabbit_channel.erl"},{line,2124}]},
{rabbit_channel,ack,2,
[{file,"src/rabbit_channel.erl"},{line,2057}]},
{rabbit_channel,handle_method,3,
[{file,"src/rabbit_channel.erl"},{line,1343}]},
{rabbit_channel,handle_cast,2,
[{file,"src/rabbit_channel.erl"},{line,644}]},
{gen_server2,handle_msg,2,
[{file,"src/gen_server2.erl"},{line,1067}]},
{proc_lib,wake_up,3,[{file,"proc_lib.erl"},{line,236}]}]}
offender: [{pid,<0.11908.0>},
{id,channel},
{mfargs,
{rabbit_channel,start_link,
[1,<0.11899.0>,<0.11906.0>,<0.11899.0>,
<<"someIPAddress:45610 -> someIPAddress:5672">>,
rabbit_framing_amqp_0_9_1,
{user,<<"someadmin">>,
[administrator],
[{rabbit_auth_backend_internal,none}]},
<<"backoffice">>,
[{<<"consumer_cancel_notify">>,bool,true},
{<<"connection.blocked">>,bool,true},
{<<"authentication_failure_close">>,bool,true}],
<0.11900.0>,<0.11907.0>]}},
{restart_type,intrinsic},
{shutdown,70000},
{child_type,worker}]
Non-AMQP exit reason '{function_clause,
[{rabbit_channel,'-notify_limiter/2-fun-0-',
[{pending_ack,1,<<"None4">>,1627738474140,
{resource,<<"backoffice">>,queue,<<"celery">>},
2097},
0],
[{file,"src/rabbit_channel.erl"},{line,2124}]},
{lists,foldl,3,[{file,"lists.erl"},{line,1267}]},
{rabbit_channel,notify_limiter,2,
[{file,"src/rabbit_channel.erl"},{line,2124}]},
{rabbit_channel,ack,2,
[{file,"src/rabbit_channel.erl"},{line,2057}]},
{rabbit_channel,handle_method,3,
[{file,"src/rabbit_channel.erl"},{line,1343}]},
{rabbit_channel,handle_cast,2,
[{file,"src/rabbit_channel.erl"},{line,644}]},
{gen_server2,handle_msg,2,
[{file,"src/gen_server2.erl"},{line,1067}]},
{proc_lib,wake_up,3,
[{file,"proc_lib.erl"},{line,236}]}]}'
supervisor: {<0.11905.0>,rabbit_channel_sup}
errorContext: shutdown
reason: reached_max_restart_intensity
offender: [{pid,<0.11908.0>},
{id,channel},
{mfargs,
{rabbit_channel,start_link,
[1,<0.11899.0>,<0.11906.0>,<0.11899.0>,
<<"someIPAddress:45610 -> someIPAddress:5672">>,
rabbit_framing_amqp_0_9_1,
{user,<<"someadmin">>,
[administrator],
[{rabbit_auth_backend_internal,none}]},
<<"backoffice">>,
[{<<"consumer_cancel_notify">>,bool,true},
{<<"connection.blocked">>,bool,true},
{<<"authentication_failure_close">>,bool,true}],
<0.11900.0>,<0.11907.0>]}},
{restart_type,intrinsic},
{shutdown,70000},
{child_type,worker}]
closing AMQP connection <0.11899.0> (someIPAddress:45610 -> someIPAddress:5672, vhost: 'backoffice', user: 'someadmin')
accepting AMQP connection <0.14133.0> (someIPAddress:57452 -> someIPAddress:5672)
connection <0.14133.0> (someIPAddress:57452 -> someIPAddress:5672): user 'someadmin' authenticated and granted access to vhost 'backoffice'
Celery log:
INFO/MainProcess] Task subscribe_task[aae43c55-3396-45f3-8bea-d01a66983835] received
DEBUG/MainProcess] TaskPool: Apply <function fast_trace_task at 0x7fbd03f32af0> (args:('subscribe_task', 'aae43c55-3396-45f3-8bea-d01a66983835', {'lang': 'py', 'task': 'subscribe_task', 'id': 'aae43c55-3396-45f3-8bea-d01a66983835', 'shadow': None, 'eta': None, 'expires': None, 'group': None, 'group_index': None, 'retries': 0, 'timelimit': [None, None], 'root_id': 'aae43c55-3396-45f3-8bea-d01a66983835', 'parent_id': None, 'argsrepr': "(140, 'Subscribe Confirm Email Address')", 'kwargsrepr': '{}', 'origin': 'gen20577@webserver', 'ignore_result': True, 'reply_to': '91cf548f-4e42-3870-b27f-2cc1fd3f7074', 'correlation_id': 'aae43c55-3396-45f3-8bea-d01a66983835', 'hostname': 'worker@webserver', 'delivery_info': {'exchange': '', 'routing_key': 'celery', 'priority': 0, 'redelivered': False}, 'args': [140, 'Subscribe Confirm Email Address'], 'kwargs': {}}, '[[140, "Subscribe Confirm Email Address"], {}, {"callbacks": null, "errbacks": null, "chain": null, "chord": null}]', 'application/json', 'utf-8') kwargs:{})
DEBUG/MainProcess] Closed channel #1
DEBUG/MainProcess] Closed channel #2
WARNING/MainProcess] consumer: Connection to broker lost. Trying to re-establish the connection...
Traceback (most recent call last):
File "/opt/backoffice/venv/lib/python3.8/site-packages/celery/worker/consumer/consumer.py", line 326, in start
blueprint.start(self)
File "/opt/backoffice/venv/lib/python3.8/site-packages/celery/bootsteps.py", line 116, in start
step.start(parent)
File "/opt/backoffice/venv/lib/python3.8/site-packages/celery/worker/consumer/consumer.py", line 618, in start
c.loop(*c.loop_args())
File "/opt/backoffice/venv/lib/python3.8/site-packages/celery/worker/loops.py", line 81, in asynloop
next(loop)
File "/opt/backoffice/venv/lib/python3.8/site-packages/kombu/asynchronous/hub.py", line 361, in create_loop
cb(*cbargs)
File "/opt/backoffice/venv/lib/python3.8/site-packages/kombu/transport/base.py", line 235, in on_readable
reader(loop)
File "/opt/backoffice/venv/lib/python3.8/site-packages/kombu/transport/base.py", line 217, in _read
drain_events(timeout=0)
File "/opt/backoffice/venv/lib/python3.8/site-packages/amqp/connection.py", line 523, in drain_events
while not self.blocking_read(timeout):
File "/opt/backoffice/venv/lib/python3.8/site-packages/amqp/connection.py", line 529, in blocking_read
return self.on_inbound_frame(frame)
File "/opt/backoffice/venv/lib/python3.8/site-packages/amqp/method_framing.py", line 53, in on_frame
callback(channel, method_sig, buf, None)
File "/opt/backoffice/venv/lib/python3.8/site-packages/amqp/connection.py", line 535, in on_inbound_method
return self.channels[channel_id].dispatch_method(
File "/opt/backoffice/venv/lib/python3.8/site-packages/amqp/abstract_channel.py", line 143, in dispatch_method
listener(*args)
File "/opt/backoffice/venv/lib/python3.8/site-packages/amqp/connection.py", line 665, in _on_close
raise error_for_code(reply_code, reply_text,
amqp.exceptions.InternalError: (0, 0): (541) INTERNAL_ERROR
DEBUG/MainProcess] | Consumer: Restarting event loop...
DEBUG/MainProcess] | Consumer: Restarting Control...
DEBUG/MainProcess] | Consumer: Restarting Tasks...
DEBUG/MainProcess] Canceling task consumer...
DEBUG/MainProcess] | Consumer: Restarting Connection...
DEBUG/MainProcess] | Consumer: Starting Connection
DEBUG/MainProcess] Start from server, version: 0.9, properties: {'capabilities': {'publisher_confirms': True, 'exchange_exchange_bindings': True, 'basic.nack': True, 'consumer_cancel_notify': True, 'connection.blocked': True, 'consumer_priorities': True, 'authentication_failure_close': True, 'per_consumer_qos': True, 'direct_reply_to': True}, 'cluster_name': 'rabbit@webserver', 'copyright': 'Copyright (c) 2007-2021 VMware, Inc. or its affiliates.', 'information': 'Licensed under the MPL 2.0. Website: https://rabbitmq.com', 'platform': 'Erlang/OTP 24.0.4', 'product': 'RabbitMQ', 'version': '3.9.0'}, mechanisms: [b'AMQPLAIN', b'PLAIN'], locales: ['en_US']
INFO/MainProcess] Connected to amqp://someAdmin:**@webserver:5672/backoffice
Celery Service config:
[Unit]
Description=Celery Service
After=network.target
[Service]
Type=forking
User=DJANGO_USER
Group=DJANGO_USER
EnvironmentFile=/etc/conf.d/celery
WorkingDirectory=/opt/backoffice
ExecStart=/bin/sh -c '${CELERY_BIN} -A $CELERY_APP multi start $CELERYD_NODES \
--pidfile=${CELERYD_PID_FILE} --logfile=${CELERYD_LOG_FILE} \
--loglevel="${CELERYD_LOG_LEVEL}" $CELERYD_OPTS'
ExecStop=/bin/sh -c '${CELERY_BIN} multi stopwait $CELERYD_NODES \
--pidfile=${CELERYD_PID_FILE} --loglevel="${CELERYD_LOG_LEVEL}"'
ExecReload=/bin/sh -c '${CELERY_BIN} -A $CELERY_APP multi restart $CELERYD_NODES \
--pidfile=${CELERYD_PID_FILE} --logfile=${CELERYD_LOG_FILE} \
--loglevel="${CELERYD_LOG_LEVEL}" $CELERYD_OPTS'
Restart=always
[Install]
WantedBy=multi-user.target
Celery conf.d:
CELERYD_NODES="worker"
CELERY_BIN="/opt/backoffice/venv/bin/celery"
CELERY_APP="backoffice"
CELERYD_CHDIR="/opt/backoffice/"
CELERYD_MULTI="multi"
CELERYD_OPTS="--time-limit=300 --without-heartbeat --without-gossip --without-mingle"
CELERYD_PID_FILE="/var/run/celery/%n.pid"
CELERYD_LOG_FILE="/var/log/celery/%n%I.log"
CELERYD_LOG_LEVEL="DEBUG"
CELERYBEAT_PID_FILE="/var/run/celery/beat.pid"
CELERYBEAT_LOG_FILE="/var/log/celery/beat.log"
CELERYBEAT_DB_FILE="/var/cache/backoffice/celerybeat/celerybeat-schedule.db"
Django celery_config.py:
from django.conf import settings
broker_url = settings.RABBITMQ_BROKER
worker_send_task_event = False
task_ignore_result = True
task_time_limit = 60
task_soft_time_limit = 50
task_acks_late = True
worker_prefetch_multiplier = 10
worker_cancel_long_running_tasks_on_connection_loss = True
ANSWER
Answered 2021-Aug-02 at 07:25
Same problem here. I tried different settings but found no solution.
Workaround: downgrade RabbitMQ to 3.8. After downgrading there were no connection errors anymore, so I think it must have something to do with the different behaviour of v3.9.
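One hedged way to pin that version is to run the official 3.8 image (this is just an illustration; use whatever matches your deployment method and migrate your definitions accordingly):
docker run -d --name rabbitmq -p 5672:5672 -p 15672:15672 rabbitmq:3.8-management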
QUESTION
Alert manager in prometheus not starting
Asked 2021-Nov-13 at 20:20
I configured the Prometheus Alertmanager with no errors during installation, but systemctl status alertmanager.service gives:
# systemctl status alertmanager.service
● alertmanager.service - Alertmanager for prometheus
Loaded: loaded (/etc/systemd/system/alertmanager.service; enabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Fri 2021-11-12 07:15:08 UTC; 4min 50s ago
Process: 1791 ExecStart=/opt/alertmanager/alertmanager --config.file=/opt/alertmanager/alertmanager.yml --storage.path=/opt/alertmanager/data (code=exited, status=1/FAILUR>
Main PID: 1791 (code=exited, status=1/FAILURE)
Nov 12 07:15:08 localhost systemd[1]: alertmanager.service: Scheduled restart job, restart counter is at 5.
Nov 12 07:15:08 localhost systemd[1]: Stopped Alertmanager for prometheus.
Nov 12 07:15:08 localhost systemd[1]: alertmanager.service: Start request repeated too quickly.
Nov 12 07:15:08 localhost systemd[1]: alertmanager.service: Failed with result 'exit-code'.
Nov 12 07:15:08 localhost systemd[1]: Failed to start Alertmanager for prometheus.
My systemd service file for alertmanager.service is:
[Unit]
Description=Alertmanager for prometheus
[Service]
Restart=always
User=prometheus
ExecStart=/opt/alertmanager/alertmanager --config.file=/opt/alertmanager/alertmanager.yml --storage.path=/opt/alertmanager/data
ExecReload=/bin/kill -HUP $MAINPID
TimeoutStopSec=20s
SendSIGKILL=no
[Install]
WantedBy=multi-user.target
The logs say:
Nov 12 13:27:01 localhost alertmanager[1563]: level=warn ts=2021-11-12T13:27:01.483Z caller=cluster.go:177 component=cluster err="couldn't deduce an advertise address: no private IP found, explicit advertise addr not provided"
Nov 12 13:27:01 localhost alertmanager[1563]: level=error ts=2021-11-12T13:27:01.485Z caller=main.go:250 msg="unable to initialize gossip mesh" err="create memberlist: Failed to get final advertise address: No private IP address found, and explicit IP not provided"
Nov 12 13:27:01 localhost systemd[1]: alertmanager.service: Main process exited, code=exited, status=1/FAILURE
Nov 12 13:27:01 localhost systemd[1]: alertmanager.service: Failed with result 'exit-code'.
Any clues to resolve the issue?
ANSWER
Answered 2021-Nov-13 at 06:47
Do you want to run Alertmanager in HA mode? It's enabled by default and requires the instance to have a private IP address.
You can specify an address explicitly with the flag:
alertmanager --cluster.advertise-address=<ip>
Otherwise, disable HA by specifying an empty value for the flag:
alertmanager --cluster.listen-address=
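Applied to the unit file shown in the question, the second option looks like this hedged sketch (same paths as above; only the last flag is new), followed by systemctl daemon-reload and a restart of the service:
ExecStart=/opt/alertmanager/alertmanager \
  --config.file=/opt/alertmanager/alertmanager.yml \
  --storage.path=/opt/alertmanager/data \
  --cluster.listen-address=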
QUESTION
How to pass broker_url from Django settings.py to a Celery service
Asked 2021-Nov-03 at 03:13
I have Celery running as a service on Ubuntu 20.04 with RabbitMQ as a broker.
Celery repeatedly restarts because it cannot access the RabbitMQ URL (RABBITMQ_BROKER), a variable held in a settings.py outside of the Django root directory.
The same happens if I try to start Celery via the command line.
I have confirmed that the variable is accessible from within Django via a print statement in views.py.
If I place the RABBITMQ_BROKER variable inside the settings.py within the Django root, Celery works.
My question is: how do I get Celery to recognise the variable RABBITMQ_BROKER when it is placed in /etc/opt/mydjangoproject/settings.py?
My celery.py file:
import os
from celery import Celery
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'mydjangoproject.settings')
app = Celery('mydjangoproject')
default_config = 'mydjangoproject.celery_config'
app.config_from_object(default_config)
app.autodiscover_tasks()
My celery_config.py file:
from django.conf import settings
broker_url = settings.RABBITMQ_BROKER
etc...
The settings.py in /etc/opt/mydjangoproject/ (non relevant stuff deleted):
from mydjangoproject.settings import *
RABBITMQ_BROKER = 'amqp://rabbitadmin:somepassword@webserver:5672/mydjangoproject'
etc...
My /etc/systemd/system/celery.service file:
[Unit]
Description=Celery Service
After=network.target
[Service]
Type=forking
User=DJANGO_USER
Group=DJANGO_USER
EnvironmentFile=/etc/conf.d/celery
WorkingDirectory=/opt/mydjangoproject
ExecStart=/bin/sh -c '${CELERY_BIN} -A $CELERY_APP multi start $CELERYD_NODES \
--pidfile=${CELERYD_PID_FILE} --logfile=${CELERYD_LOG_FILE} \
--loglevel="${CELERYD_LOG_LEVEL}" $CELERYD_OPTS'
ExecStop=/bin/sh -c '${CELERY_BIN} multi stopwait $CELERYD_NODES \
--pidfile=${CELERYD_PID_FILE} --loglevel="${CELERYD_LOG_LEVEL}"'
ExecReload=/bin/sh -c '${CELERY_BIN} -A $CELERY_APP multi restart $CELERYD_NODES \
--pidfile=${CELERYD_PID_FILE} --logfile=${CELERYD_LOG_FILE} \
--loglevel="${CELERYD_LOG_LEVEL}" $CELERYD_OPTS'
Restart=always
[Install]
WantedBy=multi-user.target
My /etc/conf.d/celery file:
CELERYD_NODES="worker"
CELERY_BIN="/opt/mydjangoproject/venv/bin/celery"
CELERY_APP="mydjangoproject"
CELERYD_CHDIR="/opt/mydjangoproject/"
CELERYD_MULTI="multi"
CELERYD_OPTS="--time-limit=300 --without-heartbeat --without-gossip --without-mingle"
CELERYD_PID_FILE="/run/celery/%n.pid"
CELERYD_LOG_FILE="/var/log/celery/%n%I.log"
CELERYD_LOG_LEVEL="INFO"
ANSWER
Answered 2021-Nov-02 at 12:57
Add the following line to the end of /etc/opt/mydjangoproject/settings.py to have Celery pick up the correct broker URL (the casing might vary based on the version of Celery you are using):
BROKER_URL = broker_url = RABBITMQ_BROKER
This will put the configuration in a place where it will be read by the call to Celery's config_from_object function.
Next, you will also have to add an environment variable to your systemd unit. Since you are accessing the settings as mydjangoproject.settings, you have to make the parent of the mydjangoproject directory accessible on the PYTHONPATH:
Environment=PYTHONPATH=/etc/opt
The PYTHONPATH gives Python a list of directories to try when resolving the import. However, because we have two different directories with the same name that we are using as a single package, we also have to add the following lines to /etc/opt/mydjangoproject/__init__.py and /opt/mydjangoproject/__init__.py:
import pkgutil
__path__ = pkgutil.extend_path(__path__, __name__)
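Putting it together with the unit file from the question, the extra environment variable goes in the [Service] section, e.g. this hedged sketch (everything except the Environment line is unchanged from the question), after which you reload systemd and restart the Celery service:
[Service]
Type=forking
User=DJANGO_USER
Group=DJANGO_USER
Environment=PYTHONPATH=/etc/opt
EnvironmentFile=/etc/conf.d/celery
WorkingDirectory=/opt/mydjangoproject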
Community Discussions and Code Snippets contain sources that include the Stack Exchange Network.
No vulnerabilities reported
HTTPS: https://github.com/dongjinyong/Gossip.git
CLI: gh repo clone dongjinyong/Gossip
SSH: git@github.com:dongjinyong/Gossip.git