Using delayed_call rather than usleep. This caused a fair amount of changes to
the timing (or delaying) aspects. Also some other small changes; more config
and all durations in milliseconds.
There's now double bookkeeping:
- All delayed calls are in a map whose key is the next
invocation time. Since it's a map (and not an unordered_map)
it's sorted just the way we want to have it.
- In addition, there's an unordered set of delayed calls for each tag.
With this arrangement we can easily invoke the delayed calls
in the right order and be able to efficiently remove all
delayed calls related to a particular tag.
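For illustration, the bookkeeping could look roughly as follows (the container
choices and names are a sketch, not the actual member declarations):

#include <map>
#include <unordered_map>
#include <unordered_set>
#include <cstdint>

class DelayedCall;  // holds the callback, tag and next invocation time

// Delayed calls ordered by their next invocation time; a multimap is used in
// this sketch so that several calls may share the same time point.
std::multimap<int64_t, DelayedCall*> calls_by_time;

// For each tag, the set of delayed calls registered with that tag, so that
// all calls of a tag can be removed efficiently when the tag is canceled.
std::unordered_map<void*, std::unordered_set<DelayedCall*>> calls_by_tag;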
When canceling, a DelayedCall instance must be removed from the
collection holding all delayed calls. Consequently, a priority_queue
cannot be used as it 1) does not provide access to the underlying
collection and 2) the underlying collection (vector or deque)
is a bad choice if items in the middle need to be removed.
The interface for canceling calls is now geared towards the needs
of sessions. Basically the idea is as follows:
class MyFilterSession : public maxscale::FilterSession
{
    ...

    int routeQuery(GWBUF* pPacket)
    {
        ...
        if (needs_to_be_delayed())
        {
            // Schedule delayed_routeQuery() to be called with pPacket
            // after 5000ms, using this session as the tag.
            Worker* pWorker = Worker::current();
            void* pTag = this;
            pWorker->delayed_call(5000, pTag, this,
                                  &MyFilterSession::delayed_routeQuery,
                                  pPacket);
            return 1;
        }
        ...
    }

    bool delayed_routeQuery(Worker::Call::action_t action, GWBUF* pPacket)
    {
        if (action == Worker::Call::EXECUTE)
        {
            routeQuery(pPacket);
        }
        else
        {
            ss_dassert(action == Worker::Call::CANCEL);
            gwbuf_free(pPacket);
        }

        return false;
    }

    ~MyFilterSession()
    {
        // Cancel any delayed calls still registered with this session's tag,
        // so they cannot fire after the session has been destroyed.
        void* pTag = this;
        Worker::current()->cancel_delayed_calls(pTag);
    }
};
The alternative, returning some key that the caller must keep
around, seems more cumbersome for the general case.
It's now possible to provide Worker with a function to call
at a later time. The function may be a free function or a
member function (with the object), taking zero or one argument
of any kind. The argument must be copyable.
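As a rough sketch, a free function taking one argument might be registered as
follows (the exact overloads and the callback convention are assumptions here,
mirroring the member-function example shown earlier in this section):

// Hypothetical free function used as a delayed call; the action argument and
// the bool return value (whether to call again) are assumed.
bool log_stats(Worker::Call::action_t action, int interval)
{
    if (action == Worker::Call::EXECUTE)
    {
        // ... do the periodic work using 'interval' ...
    }

    return false;  // do not reschedule
}

// Call log_stats(..., 42) after roughly 1000ms on the current worker.
Worker::current()->delayed_call(1000, log_stats, 42);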
There's currently no way to cancel a call; that must be added,
as typically the delayed call is associated with a session,
and if the session is closed before the delayed call is made,
bad things are likely to happen.
The Worker::Timer class and the Worker::DelegatingTimer class template are
timers built on top of timerfd_create(2). As such, each consumes a file
descriptor and hence they cannot be created independently for each
timer need.
Each Worker now has a private timer member variable on top of
which a general timer mechanism will be provided.
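The underlying mechanism is roughly the following (a plain timerfd sketch;
epoll_fd stands for the worker's epoll instance and is not an actual MaxScale
variable):

#include <sys/timerfd.h>
#include <sys/epoll.h>

int fd = timerfd_create(CLOCK_MONOTONIC, TFD_NONBLOCK | TFD_CLOEXEC);

itimerspec ts = {};
ts.it_value.tv_sec = 1;      // first tick after 1 second
ts.it_interval.tv_sec = 1;   // and then once a second
timerfd_settime(fd, 0, &ts, nullptr);

epoll_event ev = {};
ev.events = EPOLLIN;
ev.data.fd = fd;
epoll_ctl(epoll_fd, EPOLL_CTL_ADD, fd, &ev);  // EPOLLIN is reported when the timer fires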
The maximum number of workers and the maximum number of routing workers
are now hardwired to 128 and 100, respectively. It is still so that
all workers must be created at startup and destroyed at
shutdown; creating/destroying workers at runtime is not
possible.
Worker is now the base class of all workers. It has a message
queue and can be run in a thread of its own, or in the calling
thread. Worker cannot be used as such; a concrete worker
class must be derived from it. Currently there is only one
concrete class, RoutingWorker.
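As a sketch of the intended usage (the virtual hooks shown are assumptions;
the actual interface a concrete worker must implement may differ):

// Hypothetical concrete worker; the message loop itself comes from Worker.
class MyWorker : public Worker
{
public:
    // Assumed hook: called in the worker's thread before the message loop.
    bool pre_run()
    {
        return true;  // initialization succeeded
    }

    // Assumed hook: called after the message loop has exited.
    void post_run()
    {
    }
};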
There is some overlap in functionality between Worker and
RoutingWorker, as there is e.g. a need for broadcasting a
message to all routing workers, but not to other workers.
Currently other workers cannot be created, as the array
holding the pointers to the workers is exactly as large as
there will be RoutingWorkers. That will be changed so that
the maximum number of threads is hardwired to some ridiculous
value such as 128. That's the first step on the path towards
a situation where the number of worker threads can be changed
at runtime.
A new class mxs::Worker will be introduced and mxs::RoutingWorker
will be inherited from that. mxs::Worker will basically only be a
thread with a message-loop.
Once available, all current non-worker threads (except the one
implicitly created by microhttpd) can be created by inheriting
from that; in practice that means the housekeeping thread, all
monitor threads and possibly the logging thread.
The benefit of this arrangement is that there will then be a general
mechanism for cross-thread communication without having to use any
shared data structures.
The old hkheartbeat variable was changed to the mxs_clock() function that
simply wraps an atomic load of the variable. This allows it to be
correctly read by MaxScale as well as opening up the possibility of
converting the value load to a relaxed memory order read.
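Roughly, the idea is as follows (a sketch; the exact type of the tick counter
is an assumption):

#include <atomic>
#include <cstdint>

static std::atomic<int64_t> hkheartbeat;  // the old tick variable, now behind a function

int64_t mxs_clock()
{
    // A plain (sequentially consistent) load for now; this could later be
    // relaxed to hkheartbeat.load(std::memory_order_relaxed).
    return hkheartbeat.load();
}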
Renamed the header and associated macros. Removed inclusion of the
heartbeat header from the housekeeper header and added it to the files
that were missing it.
The variable containing the number of workers must be updated
only after the workers have been successfully created.
Failure to do this led to a crash in Worker::shutdown_all() if a
terminating signal was received after the worker initialization
had failed.
With a granularity of 1 second, the load will from a human
perspective reflect the current situation. That also means
that the maxadmin output shows "natural" steps: 1s, 1m and 1h.
By definition, the load is calculated using the following formula:
L = 100 * ((T - t) / T)
where T is a time period and t the time of that period that the worker
spends in epoll_wait(). So, if there is so much work that epoll_wait()
always returns immediately, then the load is 100 and if the thread
spends the entire period in epoll_wait(), then the load is 0.
The basic idea is that the timeout given to epoll_wait() is adjusted
so that epoll_wait() will always return at roughly 10-second intervals.
By making a note of when we are about to enter epoll_wait() and when we
return from it, we have all the information we need for calculating the
load.
Due to the nature of things, we will not be able to calculate the load
at exact 10-second boundaries, but it will be pretty close. And the load
is always calculated using the true length of the period.
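A sketch of the idea (now_ms(), running, epoll_fd, events and MAX_EVENTS are
placeholders, not the actual implementation):

#include <sys/epoll.h>
#include <algorithm>
#include <cstdint>

int64_t period_start = now_ms();   // when the current ~10s period began
int64_t waited = 0;                // time spent inside epoll_wait() during the period

while (running)
{
    int64_t before = now_ms();
    int timeout = static_cast<int>(std::max<int64_t>(0, (period_start + 10000) - before));

    int n = epoll_wait(epoll_fd, events, MAX_EVENTS, timeout);

    int64_t after = now_ms();
    waited += after - before;

    // ... deliver the n events ...

    if (after - period_start >= 10000)
    {
        int64_t T = after - period_start;      // true length of the period
        int load = 100 * (T - waited) / T;     // e.g. 2s waited out of 10s => load 80
        // ... record the 10-second load value ...
        period_start = after;
        waited = 0;
    }
}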
We will then calculate 1 minute load by averaging the load value for 6
consecutive 10-second periods and the 1 hour load by averaging the load
value of 60 consecutive 1 minute loads.
So, while the 10-second load represents the load of the most recently
measured 10-second period (and not the load of the most recent 10
seconds), the 1 minute load and the 1 hour load represent the load of
the most recent minute and hour, respectively.
Since a shutdown message will now be sent via the regular epoll route,
there is no need to regularly wake up from epoll in order to check
whether shutdown has been initiated; we can simply wait in epoll_wait
until told to wake up.
Pre-loading users for all threads at startup significantly reduces the
chance of failures caused by the lazy initialization of the user database
done by the authenticators.
If users are not loaded at startup and the connection limit for all
servers is reached, authentication in MaxScale will fail not due to too
many connections but due to the lack of authentication data. This causes
repeated reloading of users, which floods the log with messages and puts
unnecessary stress on the cluster itself.
The internal header directory conflicted with in-source builds causing a
build failure. This is fixed by renaming the internal header directory to
something other than maxscale.
The renaming pointed out a few problems in a couple of source files that
appeared to include internal headers when the headers were in fact public
headers.
Fixed maxctrl in-source builds by making the copying of the sources
optional.
By moving the initialization into Worker::run, all threads, including the
main thread, are properly initialized. This was not noticed before as
qc_sqlite initialized the main thread in the process initialization
callback.
The polling mechanism can now optionally be used for managing
the lifetime of an object placed into the poll set.
If an MXS_POLL_DATA has a non-null 'free' function, then the reference count
of the data will be increased before calling the handler and
decreased after. In that case, if the reference count reaches 0,
the free function will be called.
Note that the reference counts of *all* MXS_POLL_DATAs returned
by 'epoll_wait' will be increased before the events are delivered
to the handlers individually for each MXS_POLL_DATA, and then once
all events have been delivered, the reference count of each
MXS_POLL_DATA will be decreased.
This ensures that if there are interdependencies between different
MXS_POLL_DATAs returned by one call to 'epoll_wait', the case where
an MXS_POLL_DATA is deleted before its events have been delivered
can be avoided by using the reference count for lifetime management.
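Schematically, the delivery loop then looks roughly like this (the handler
signature and the inc/dec helpers are assumptions for illustration, not the
actual declarations):

int n = epoll_wait(epoll_fd, events, MAX_EVENTS, timeout);

// First pass: take a reference on every returned MXS_POLL_DATA.
for (int i = 0; i < n; ++i)
{
    MXS_POLL_DATA* data = static_cast<MXS_POLL_DATA*>(events[i].data.ptr);
    if (data->free)
    {
        mxs_poll_data_inc_ref(data);  // hypothetical helper
    }
}

// Second pass: deliver the events individually.
for (int i = 0; i < n; ++i)
{
    MXS_POLL_DATA* data = static_cast<MXS_POLL_DATA*>(events[i].data.ptr);
    data->handler(data, thread_id, events[i].events);  // assumed handler signature
}

// Third pass: release the references; 'free' is called when the count hits 0.
for (int i = 0; i < n; ++i)
{
    MXS_POLL_DATA* data = static_cast<MXS_POLL_DATA*>(events[i].data.ptr);
    if (data->free && mxs_poll_data_dec_ref(data) == 0)
    {
        data->free(data);
    }
}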
In subsequent commits, the reference count will be taken into use
in the lifetime management of DCBs.
It is now possible to specify the thread stack size to be used
when a new thread is created. This will subsequently be used
to allow the stack size to be specified for worker threads.
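For reference, this boils down to standard pthread attribute handling
(thread_main and pArg are placeholders, not actual MaxScale names):

#include <pthread.h>

pthread_attr_t attr;
pthread_attr_init(&attr);
pthread_attr_setstacksize(&attr, 1024 * 1024);   // e.g. a 1 MiB stack

pthread_t thread;
pthread_create(&thread, &attr, thread_main, pArg);
pthread_attr_destroy(&attr);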
The template class wraps a HashMap such that only a few operations
are allowed. Usage requires specializing a RegistryTraits class
template for each entry type.
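A specialization might look roughly like this (the entry type and the members
the registry template expects are assumptions, not the actual interface):

#include <cstdint>

// Hypothetical entry type used only for this sketch.
struct MySession
{
    uint64_t id;
};

template<>
struct RegistryTraits<MySession>
{
    typedef uint64_t   id_type;
    typedef MySession* entry_type;

    static id_type get_id(entry_type entry)
    {
        return entry->id;
    }

    static entry_type null_entry()
    {
        return NULL;
    }
};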
The top level resource self links pointed to the collection instead of the
resource itself. The individual resources now also have a links field that
contains the self link to the resource. This should make navigation of the
API easier as all objects have valid links in them.
Various small changes to part2, as suggested by comments and otherwise.
Mostly renaming; the working logic should not change.
Exception: the session id was changed to 64 bits in the container and associated
functions. Another commit will change it to 64 bits in the session itself.
MySQL sessions are added to a hashmap when created and removed when closed.
MYSQL_COM_PROCESS_KILL is now detected, the thread_id is read and the kill
command is sent to all worker threads to find the correct session. If found, a
fake hangup event is created for the client dcb.
As is, this function is of little use since the client could just disconnect
itself instead. Later on, additional commands of this nature will be added.
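One way to express the kill handling with the worker task mechanism described
later in this section (class names, the task interface and the helper
functions below are illustrative assumptions, not the actual code):

// Hypothetical disposable task broadcast to all workers; each worker checks
// its own session registry for the given thread id.
class KillTask : public Worker::DisposableTask
{
public:
    KillTask(uint64_t thread_id)
        : m_thread_id(thread_id)
    {
    }

    // Assumed to run in the thread context of each worker.
    void execute(Worker& worker)
    {
        MXS_SESSION* pSession = worker_find_session(worker, m_thread_id);  // hypothetical lookup
        if (pSession)
        {
            poll_fake_hangup_event(pSession->client_dcb);  // fake hangup for the client dcb
        }
    }

private:
    uint64_t m_thread_id;
};

// When MYSQL_COM_PROCESS_KILL is detected:
Worker::broadcast(std::auto_ptr<Worker::DisposableTask>(new KillTask(thread_id)));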
The old polling message system is obsolete now that the worker messages
are implemented. The old system was only used to clean up the persistent
connection pool of a server.
It is now possible to define whether tasks are executed immediately or put
on the event queue of the worker thread. If task execution is in automatic
mode and the currently executing thread is a worker thread, the
Task->execute method is called immediately.
This allows tasks to be posted from within worker threads. This is
intended to be used when purging stale persistent connections and
printing diagnostic output via MaxAdmin. All of these actions are done
from within a worker thread.
- Posting a task to a worker for execution (without implicit wait)
is called "post".
- Posting a task to every worker for execution (without implicit wait)
is called "broadcast".
In these cases the task must be provided as a pointer or auto_ptr, to
indicate that the pointed-to task must remain alive for longer than
the duration of the function call.
- Posting a task to every worker for execution *and* waiting for all workers
to have executed the task is called "execute", and the two variants are
now called "execute_concurrently" and "execute_serially".
In these cases the task is provided as a reference, since the functions
will return only when all workers have (in concurrent or serial fashion)
executed the task. That is, it need not remain alive for longer than the
duration of the function call.
Concurrently executing a task on all workers *and* waiting until
all workers have executed the task seems to be common enough to
warrant a helper function for that purpose.
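For instance (a sketch; where the helper lives and its exact signature are
assumptions):

MyTask task;  // a Worker::Task implementation

// Returns only once every worker has executed the task, so the task may
// safely live on the stack.
Worker::execute_concurrently(task);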
When the Worker mechanism has been initialized the current_worker_id
of the calling thread is set to 0. That way, connections can be created
after Worker::init() has been called, but before the workers have been
started. Such connections will be handled by the worker that is running
in the main thread.
A Worker::Task is an object that can be sent to a worker for
execution. The task is sent to the worker using the messaging
mechanism where the `execute` function of the task will be
called in the thread context of the worker.
There are two kinds of tasks: regular tasks and disposable tasks.
The former are just sent to the worker for execution while the
latter are sent and subsequently disposed of, once the task has
been executed.
A disposable task can be sent to either one worker or to all
workers. In the latter case, the task will be deleted once it
has been executed by all workers.
A semaphore can be associated with a regular task. Once the task
has been executed by the worker, the semaphore will automatically
be posted. That way, it is trivial to send a task for execution
to a worker and wait until the task has been executed. For instance:
Semaphore sem;
MyTask task;

pWorker->execute(&task, &sem);  // send the task to the worker
sem.wait();                     // wait until the worker has executed it

const MyResult& result = task.result();
The low level mechanism for posting and broadcasting messages will
be removed.
All workers share an epoll instance that is added level-triggered
to the epoll instance of each Worker. This is intended to be used
together with listening sockets.
When a listening socket is added to the shared epoll instance the
effect is that EPOLLIN will be active for it whenever there is a
connection pending on a listening socket added to that epoll
instance.
When that occurs, all workers will return from their epoll_wait() calls.
When the workers subsequently call epoll_wait() on the shared epoll
instance, that call will return with an event, provided some other thread
has not yet called accept() on the listening socket.
As each worker extracts just one event at a time and calls accept() just
once before calling epoll_wait() again, the client connections will be
distributed evenly across all workers, provided the load on the workers
is roughly the same. If it isn't, then a worker with less load will get
more connections to handle (which will even out the load).
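In outline (a sketch; listen_fd, worker_epoll_fd and the surrounding event
loop are placeholders, not the actual implementation):

#include <sys/epoll.h>
#include <sys/socket.h>

// A single shared epoll instance holds the listening sockets.
int shared_fd = epoll_create1(EPOLL_CLOEXEC);

epoll_event listen_ev = {};
listen_ev.events = EPOLLIN;
listen_ev.data.fd = listen_fd;
epoll_ctl(shared_fd, EPOLL_CTL_ADD, listen_fd, &listen_ev);

// Each worker adds the shared instance (level-triggered, the default) to its
// own epoll instance, so a pending connection wakes up every worker.
epoll_event shared_ev = {};
shared_ev.events = EPOLLIN;
shared_ev.data.fd = shared_fd;
epoll_ctl(worker_epoll_fd, EPOLL_CTL_ADD, shared_fd, &shared_ev);

// In a worker's event loop, when shared_fd reports EPOLLIN:
epoll_event pending;
if (epoll_wait(shared_fd, &pending, 1, 0) == 1)            // extract just one event
{
    int client_fd = accept(pending.data.fd, NULL, NULL);   // accept just once
    // ... register client_fd with this worker's own epoll instance ...
}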